FOREWORD


Opinions,  interpretations,  conclusions  and  recommendations  are  those  of  the  author 
and  are  not  necessarily  endorsed  by  the  U.S.  Army. 


Where  copyrighted  material  is  quoted,  permission  has  been  obtained  to  use 
such  material. 

Where  material  from  documents  designated  for  limited  distribution  is  quoted, 
permission  has  been  obtained  to  use  the  material. 

Citations  of  commercial  organizations  and  trade  names  in  this  report  do  not 
constitute  an  official  Department  of  Army  endorsement  or  approval  of  the 
products  or  services  of  these  organizations. 

In  conducting  research  using  animals,  the  investigator(s)  adhered  to  the 
"Guide  for  the  Care  and  Use  of  Laboratory  Animals,"  prepared  by  the 
Committee on Care and Use of Laboratory Animals of the Institute of
Laboratory Resources, National Research Council (NIH Publication No. 86-23,
Revised  1985). 

For  the  protection  of  human  subjects,  the  investigator(s)  adhered  to  policies 
of  applicable  Federal  Law  45  CFR  46. 

In  conducting  research  utilizing  recombinant  DNA  technology,  the 
investigator(s)  adhered  to  current  guidelines  promulgated  by  the  National 
Institutes  of  Health. 

In  the  conduct  of  research  utilizing  recombinant  DNA,  the  investigator(s) 
adhered  to  the  NIH  Guidelines  for  Research  Involving  Recombinant  DNA 
Molecules. 

In  the  conduct  of  research  involving  hazardous  organisms,  the  investigator(s) 
adhered  to  the  CDC-NIH  Guide  for  Biosafety  in  Microbiological  and 
Biomedical  Laboratories. 

B. INTRODUCTION

In  clinical  follow-up  studies,  subjects  are  monitored  at  regular  time  intervals  for  a 
physical  condition.  It  is  often  the  case  that  an  event  under  observation  can  take  place  in 
between  two  successive  visits,  and  it  may  not  be  possible  for  the  subject  to  know  the  time 
to  such  an  event  exactly.  For  example,  consider  the  situation  in  which  a  group  of  women 
at  high  risk  for  breast  cancer  is  asked  to  take  a  chemopreventive  substance  for  a  fixed  time 
period.  At  the  end  of  the  period,  each  participating  woman  is  required  to  submit  a  blood 
or  urine  sample  at  regular  intervals  in  order  to  monitor  the  level  of  a  validated  intermediate 
biomarker.  Let  X  denote  the  time  from  cessation  of  use  of  the  agent  to  the  loss  of  its 
protective  effect,  quantified  as  a  return  to  baseline  value  of  the  biomarker.  If  a  woman 
submits  a  sample  for  assay  on  a  daily  basis,  the  value  of  X  can  be  observed  exactly,  unless 
the  protective  effect  is  still  present  by  the  time  the  study  is  terminated  so  that  X  is  right 
censored  in  the  usual  sense  of  survival  analysis.  In  practice,  however,  the  follow-up  interval 
can  be  a  week  or  longer;  therefore  the  exact  value  of  X  is  generally  unknown  but  is  known  to 
lie between the time points L and R, where L is the number of days from cessation of agent
intake to the last time the sample was assayed and the protective effect was still present, and
R is the number of days from cessation of agent intake to the most recent time the sample
was assayed. If the protective effect is still present, then R takes the value infinity. In any
case, when the value of X is only known to lie between L and R, we say that X is censored in
the interval (L, R). Therefore the observed data consist of either censoring intervals (L, R)
or exact observations X = L = R.
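
The construction of (L, R) from one subject's visit record can be sketched as follows. This is an illustration only, not code from the project: the function name `censoring_interval` and its two inputs (a list of assay days and a parallel list of flags saying whether the protective effect was still present) are assumptions made for the example.

```python
import math

def censoring_interval(visit_days, effect_present):
    """Return the censoring interval (L, R) for one subject.

    visit_days     -- increasing assay days counted from cessation of agent intake
    effect_present -- parallel booleans, True while the protective effect is present
    (Both argument names are hypothetical, chosen for this sketch.)
    """
    last_present = 0.0  # L: last assay day at which the effect was still present
    for day, present in zip(visit_days, effect_present):
        if present:
            last_present = day
        else:
            # First assay showing loss of effect: X lies between L and this day.
            return (last_present, day)
    # Effect still present at the final assay: right censored, R is infinity.
    return (last_present, math.inf)
```

With weekly visits, a subject whose effect is present on days 7 and 14 but absent on day 21 contributes the interval (14, 21); a subject whose effect is present at every visit contributes (last visit day, infinity).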

Our research project is concerned with nonparametric estimation of the distribution
function F(t) = Pr(X ≤ t) of a real-valued random variable X, or equivalently its survival
function S(t) = 1 - F(t), when the sample data are incomplete due to restricted observation
brought about by interval censoring. The generalized maximum likelihood (GML) method in
the sense of Kiefer and Wolfowitz [1] is the standard approach to estimating S. At present,
there are two iterative computation procedures that yield the GML estimate (GMLE)
of S at convergence. The first is due to Peto [2] and makes use of Newton's method.
The second is due to Turnbull [3] and makes use of a simpler but slower algorithm called the
self-consistent algorithm. A solution to this algorithm is also called a self-consistent estimator
(SCE).
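
A self-consistent iteration of the kind described above can be sketched in a few lines. The sketch below is a simplified illustration, not the project's software and not a verbatim transcription of the algorithm in [3]. It assumes every observation is a half-open interval (L, R] with L < R (no exact observations), and it places candidate probability masses on the distinct right endpoints, with infinity standing for "beyond the last inspection". The function name and the tolerance parameters are hypothetical.

```python
import math

def self_consistent_estimate(intervals, tol=1e-10, max_iter=10_000):
    """Iterate the self-consistency equation for interval-censored data.

    intervals -- list of (L, R) pairs with L < R; R may be math.inf.
    Returns (support, masses): candidate mass points and their estimated masses.
    """
    if any(not l < r for l, r in intervals):
        raise ValueError("this sketch requires L < R for every observation")
    n = len(intervals)
    support = sorted({r for _, r in intervals})
    # alpha[i][j] = 1 if mass point j is compatible with observation i.
    alpha = [[1.0 if l < s <= r else 0.0 for s in support] for l, r in intervals]
    p = [1.0 / len(support)] * len(support)
    for _ in range(max_iter):
        new_p = [0.0] * len(support)
        for row in alpha:
            denom = sum(a * pj for a, pj in zip(row, p))
            for j, a in enumerate(row):
                if a:
                    # Share observation i among its compatible mass points.
                    new_p[j] += p[j] / denom / n
        converged = max(abs(a - b) for a, b in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return support, p
```

The estimated survival function at t is then the total mass on support points greater than t. Each update only redistributes one unit of mass per observation, which is why the iteration is simple but can converge slowly.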

Because there is no closed-form expression for the GMLE of S, it has been difficult to
study  its  asymptotic  statistical  properties,  including  consistency,  normality  and  efficiency. 
Such  a  setback  in  the  statistical  development  of  the  GMLE  has  severely  limited  its  use  in 
the  statistical  analysis  of  interval-censored  (IC)  data. 

Before we began our funded Army research, we had extended Efron's redistribution-to-the-right
idea for right-censored data [4] and proposed a redistribution-to-the-center (RTC)
method that yields a nonparametric estimator of S, called the RTC estimate (RTCE).
Such an estimator has a closed-form expression and can be readily calculated for IC data
of any dimension. IC data are said to satisfy the DI (disjoint or included) condition if, for every
two censoring intervals, either they are disjoint or one is a subset of the other. For instance,
if every subject in a clinical study has the same follow-up schedule, say at time points
a_1, a_2, ..., a_k, then (L, R) = (0, a_1), or (a_j, a_{j+1}), or (a_j, ∞). A sample of such IC data
(L_1, R_1), ..., (L_n, R_n) will satisfy the DI condition. We had shown that under the DI condition,
RTCE  is  actually  GMLE  itself.  This  important  observation,  together  with  the  availability 
of  an  explicit  expression,  had  motivated  us  to  submit  the  present  proposal  on  RTCE  to  the 
Army. 
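
Whether a given sample satisfies the DI condition can be checked pairwise. The following sketch is illustrative only; it treats each censoring interval as the open interval (L, R), and the function name `satisfies_di` is an assumption of the example.

```python
import math
from itertools import combinations

def satisfies_di(intervals):
    """Return True if every pair of intervals is disjoint or nested (DI condition).

    intervals -- list of (L, R) pairs, interpreted here as open intervals (L, R).
    """
    def disjoint(a, b):
        return a[1] <= b[0] or b[1] <= a[0]

    def included(a, b):
        # a is a subset of b
        return b[0] <= a[0] and a[1] <= b[1]

    return all(
        disjoint(a, b) or included(a, b) or included(b, a)
        for a, b in combinations(intervals, 2)
    )
```

For a common schedule at days 1, 2, 3, the sample (0, 1), (1, 2), (2, 3), (2, ∞), (3, ∞) passes the check, while the overlapping pair (0, 2), (1, 3) does not.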

In  our  first  year  of  research,  we  completed  our  research  for  Task  1  and  Task  2  in 
the Statement of Work for RTCE. However, we also discovered that in the case of non-DI
data, RTCE may be different from GMLE, and RTCE is not always consistent. The
interesting and intriguing observation is that the difference between RTCE and GMLE is
small, at least based on our limited simulation studies [5]. In establishing the consistency result
for RTCE under the DI condition, we gained important insight into proofs of asymptotic
properties for GMLE, which does not possess a closed-form expression. Because GMLE is
the preferred estimator for S, we decided to focus our attention on GMLE instead of RTCE
for  the  remainder  of  the  funded  research,  and  we  have  successfully  completed  all  the  tasks 
stated  in  the  Statement  of  Work  for  GMLE. 

Our research was then extended to statistical inference with multivariate
interval-censored data, which may also occur in breast cancer research, and to Cox regression
models. Some results have been obtained in these respects.

C.  BODY 

C.1. Basic setup

Interval-censored  data  can  arise  in  the  following  four  situations: 

1.  Case  2  IC  data  (C2  data)  consist  of  right-censored  (R  =  oo),  left-censored  (L  =  0)  and 
strictly  interval-censored  observations  (0  <  L  <  R  <  oo).  These  are  by  far  the  most 
common  type  of  IC  data  in  clinical  follow-up  studies. 

2.  Mixed  IC  data  (MIC  data)  consist  of  both  C2  data  and  exact  observations  (L  =  R). 
Yu,  Li  and  Wong  [6]  presented  an  example  involving  MIC  data  from  a  breast  cancer 
follow-up  study. 

3. Case 1 IC data (C1 data) consist of either right-censored or left-censored observations.
For example, when an animal is sacrificed for inspection of a tumor formation, time to
appearance of the tumor is C1 interval censored. Examples of C1 data can be found in
[7] and [8].

4.  Doubly-censored  data  (DC  data)  consist  of  right-,  left-censored  and  exact  observations. 
An  example  with  DC  data  is  given  in  [9]. 
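
The four data types can be told apart mechanically from the observed pairs (L, R). The sketch below is an illustrative labelling rule based on the definitions in the list above, not part of the project's software; the function names and the string labels are assumptions, and a sample made up only of exact observations is arbitrarily labelled "DC".

```python
import math

def classify_observation(left, right):
    """Label one observation (L, R) by its censoring pattern."""
    if left == right:
        return "exact"
    if right == math.inf:
        return "right-censored"
    if left == 0:
        return "left-censored"
    return "strictly interval-censored"

def data_type(observations):
    """Label a sample as C1, C2, DC or MIC from the patterns it contains."""
    kinds = {classify_observation(l, r) for l, r in observations}
    if kinds <= {"right-censored", "left-censored"}:
        return "C1"   # only right- or left-censored observations
    if "exact" not in kinds:
        return "C2"   # adds strictly interval-censored observations
    if "strictly interval-censored" not in kinds:
        return "DC"   # right-, left-censored and exact observations
    return "MIC"      # C2 patterns together with exact observations
```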

We have formulated four different interval censorship models corresponding to the four
IC  data  types.  To  study  the  asymptotic  properties  of  the  GMLE,  we  make  use  of  the 
following  assumptions: 

(AS1)  The  censoring  distribution  is  discrete  but  the  survival  distribution  is  arbitrary. 
(AS2)  The  support  set  of  the  censoring  vector  is  finite,  but  the  survival  distribution  is 
arbitrary. 

(AS3) A probability restriction. See Section C.3.

(AS4) A probability restriction. See Section C.4.

(AS5)  The  censoring  distribution  and  the  survival  distribution  are  arbitrary,  but  have 
to  satisfy  some  regularity  conditions,  stated  in  Gu  and  Zhang  [10]. 

C.2.  Case  1  model 

The Case 1 model for C1 data assumes that the survival time X and a random inspection
time Y are independent. We always observe Y. However, X is not observed; we know
only whether X ≤ Y or X > Y. Under assumption AS1, we have shown that
GMLE  is  strongly  consistent,  asymptotically  normal  and  asymptotically  efficient  at  all  the 
inspection  times.  The  results  are  published  in  Yu,  Schick,  Li  and  Wong  [11]. 

C.3.  Case  2  model 

The  C2  model  for  C2  data  assumes  that  X  and  the  random  censoring  vector  (Y,  Z)  are 
independent  and  that  Y  <  Z  with  probability  one.  We  do  not  observe  X  except  that  we 
know  X  is  before  Y,  or  between  Y  and  Z,  or  after  Z.  We  state  an  assumption  for  C2  model 
as  follows: 

(AS3) P{Y ∈ I_i ∩ I_j} > 0 for any two realizations of (L, R), (L_i, R_i) = I_i and (L_j, R_j) =
I_j, provided I_i ∩ I_j ≠ ∅.

Under  the  assumption  AS1,  we  have  shown  that  GMLE  is  strongly  consistent.  Under 
the  assumptions  AS2  and  AS3,  we  have  shown  that  GMLE  is  asymptotically  normal  and 
efficient.  The  results  are  published  in  Yu,  Schick,  Li  and  Wong  [12]. 

C.4.  MIC  model 

Mixture  interval  censorship  (MIC)  model  for  MIC  data  assumes  that  an  IC  observation 
is  drawn  from  a  probability  mixture  of  C2  model  and  the  usual  right  censorship  model  for 
right-censored  data. 

Define τ = sup{t: Pr(min(X, T) < t) < 1}, τ_Y = sup{t: Pr(Y < t) = 0}, and
τ_Z = sup{t: Pr(Z < t) < 1}. We assume that τ > τ_Z. We state an assumption for the MIC
model as follows:

(AS4) Pr(L = τ) > 0 if Pr(X < τ) < 1, and Pr(R = τ_Y) > 0 if Pr(X < τ_Y) > 0.

Under  assumptions  AS2  and  AS4,  we  have  shown  that  GMLE  is  strongly  consistent 
(Yu,  Li  and  Wong  [6]),  and  under  assumptions  AS2,  AS3  and  AS4,  GMLE  is  asymptotically 
normal  (Yu,  Li  and  Wong  [13]).  Recently,  we  have  been  able  to  establish  these  asymptotic 
properties  without  the  need  of  assumption  AS2.  A  manuscript  on  these  results  has  been 
submitted for publication (Yu, Li and Wong [14]).

C.5.  DC  model 

The  DC  model  for  DC  data  assumes  that  X  and  a  random  vector  (Y,  Z)  are  independent 
and  Y  <  Z  with  probability  one,  and  that  X  is  uncensored  if  Y  <  X  <  Z,  right  censored  if 
Z < X and left censored if X < Y. Let S_Z and S_Y be the survival functions of Z and Y,
respectively, and let K = S_Y - S_Z. We state an assumption for the DC model as follows:

(AS5) K(x-) > 0 for all x such that S(x) < 1 and S(x-) > 0.


We  have  shown  in  a  submitted  manuscript  (Yu  and  Wong  [15])  that  in  order  to  establish 
asymptotic results, GMLE has to be modified. Under assumptions AS4 and AS5, we have
shown that the modified GMLE is strongly consistent; under assumptions AS3, AS4 and
AS5, it is also asymptotically normal and efficient.

C.6.  Two-sample  nonparametric  test 

Based  on  the  asymptotic  results  that  we  have  established  for  different  IC  models,  we 
have  successively  derived  the  asymptotic  distribution  of  the  following  two-sample  distance 
test  statistics  for  each  model: 


D = ∫_{T_1}^{T_2} W(t) (Ŝ_1(t) - Ŝ_2(t)) dt,

where T_1 and T_2 are specified time points, Ŝ_1 and Ŝ_2 are the estimated survival functions
of the two samples, and W(t) is a weight function. A manuscript on
the asymptotic results of D is being prepared.
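
For two estimated survival curves that are step functions, the weighted distance D, the integral of W(t) times the difference of the two curves over [T_1, T_2], can be approximated numerically. The sketch below is illustrative only: it computes the statistic D by a midpoint rule and does not attempt the asymptotic variance or Z-score. The representation of a curve as a pair (jump times, values after each jump) and all function names are assumptions of the example.

```python
from bisect import bisect_right

def step_survival(times, values, t):
    """Right-continuous step function: 1.0 before times[0], values[k] from times[k] on."""
    k = bisect_right(times, t)
    return 1.0 if k == 0 else values[k - 1]

def weighted_distance(est1, est2, t1, t2, weight=lambda t: 1.0, n_grid=1000):
    """Midpoint-rule approximation of the integral of W(t) * (S1(t) - S2(t)) on [t1, t2].

    est1, est2 -- (times, values) pairs describing the two step functions.
    """
    h = (t2 - t1) / n_grid
    total = 0.0
    for k in range(n_grid):
        t = t1 + (k + 0.5) * h
        diff = step_survival(*est1, t) - step_survival(*est2, t)
        total += weight(t) * diff * h
    return total
```

For example, if the first curve drops from 1 to 0.5 at t = 1 and the second stays at 1, then with W(t) = 1 on [0, 2] the distance is approximately -0.5.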


C.7.  Proportional  hazards  model 

In  our  original  proposal,  we  had  assigned  three  months  of  time  for  Task  7  on  Cox 
regression  for  IC  data.  However,  we  have  realized  that  statistical  inference  for  the  parameter 
β in Cox regression under interval censorship is much more involved than its counterpart
in the usual right-censored situation. In the latter case, the maximum likelihood estimator
(MLE) of β does not depend on the baseline survival function S_0(t) owing to the simple
nature of the partial likelihood approach. However, such simplicity of the likelihood function
does not carry over to the interval censorship model, and maximum likelihood estimation
of β will involve GML estimation of S_0(t) at the same time, thus resulting in a difficult
high-dimensional estimation problem.

Under  the  restrictive  assumption  that  both  X  and  the  censoring  vector  take  on  finitely 
many values, we have proved that the MLE of β and the GMLE of S_0(t), and hence the
survival function S(t|Z) = S_0(t)^{exp(β'Z)}, where Z denotes a vector of covariates for Cox
regression, are consistent and asymptotically normal (Li, Yu and Wong [17]). Much more
effort  is  needed  to  pursue  research  on  the  asymptotic  inference  of  Cox  regression  model 
under  more  relaxed  assumptions  on  the  distributions  of  X  and  the  censoring  vector. 
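
Under the usual proportional hazards form, the survival function at covariate vector Z is the baseline survival raised to the power exp(β'Z). Once estimates of β and S_0(t) are available, the plug-in evaluation is a one-line computation, sketched below for illustration. The function name and argument layout are assumptions, and the estimation of β and S_0(t) themselves is not shown.

```python
import math

def cox_survival(baseline_survival, beta, covariates):
    """Evaluate S(t|Z) = S_0(t) ** exp(beta'Z) for one time point.

    baseline_survival -- value of S_0(t), a number in [0, 1]
    beta, covariates  -- sequences of equal length
    """
    if len(beta) != len(covariates):
        raise ValueError("beta and covariates must have the same length")
    linear_predictor = sum(b * z for b, z in zip(beta, covariates))
    return baseline_survival ** math.exp(linear_predictor)
```

With β = 0 the covariates have no effect and the baseline value is returned unchanged; a linear predictor of log 2 doubles the cumulative hazard, so a baseline survival of 0.5 becomes approximately 0.25.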

During the no-cost extension period, we have devoted our effort to the implementation
of a Newton-Raphson algorithm for computing the MLE of β and the GMLE of S_0(t).
Although the algorithm is straightforward to derive using the asymptotic covariance matrix
which we have derived for the Cox parameters, we soon realized there are two difficult
problems associated with the Newton-Raphson algorithm. The first problem is that the
algorithm is computationally infeasible for data of moderate size. For example, in the
prognostic analysis of a breast cancer relapse follow-up study with n = 374 women which
we shall describe in Section C.9, the Newton-Raphson algorithm broke down owing to the
numerical difficulty associated with inverting a Hessian matrix of order 60. Another problem
with the Newton-Raphson algorithm is that it does not guarantee that the strict monotonicity
condition S_0(t_1) > ... > S_0(t_m) is satisfied at each iteration, where t_1, ..., t_m are the
ordered distinct time points. When this condition is violated, we have to re-compute
the estimates by assuming S_0(t_j) = S_0(t_k) for some j ≠ k. Since there are a maximum
of 2^m such possibilities, it will be computationally infeasible to apply the Newton-Raphson
algorithm to a data set with even a moderate m.
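
The monotonicity requirement on the baseline estimates is easy to test after each iteration, even though repairing a violation is expensive. The following sketch only detects violations; it is an illustration with a hypothetical function name, not part of the algorithm described above.

```python
def monotonicity_violations(baseline_values):
    """Return indices j at which S_0(t_j) > S_0(t_{j+1}) fails.

    baseline_values -- current iterates of S_0 at the ordered distinct time points.
    """
    return [
        j
        for j in range(len(baseline_values) - 1)
        if not baseline_values[j] > baseline_values[j + 1]
    ]
```

An empty list means the current iterate is strictly decreasing; any returned index marks a pair of adjacent time points whose estimates would have to be tied before re-computing.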

The  above  computational  problems  associated  with  the  Newton-Raphson  algorithm 
have motivated us to consider a two-step estimation approach for the Cox regression parameters.
Briefly, in step 1, the regression coefficients are estimated by a simple Newton-Raphson
algorithm through the device of a data grouping scheme; in step 2, the baseline survival
function is estimated by a simple self-consistent algorithm based on the original data. The
details of our novel approach are contained in the DOD grant proposal “Cox regression model for
interval-censored data in breast cancer follow-up studies”, which we have submitted to the
USAMRMC  for  consideration  for  funding. 

C.8.  Computer  software 

We have made available to the public a set of computer programs for calculating
RTCE  and  GMLE,  for  carrying  out  asymptotic  inference  of  GMLE  for  all  patterns  of  interval 
censorship,  and  for  evaluating  the  Z-score  of  the  proposed  two-sample  weighted  distance  test. 
These  programs  can  be  accessed  via  the  internet  at  qyu@math.binghamton.edu. 

C.9.  Applications  to  breast  cancer  research 

We  have  applied  our  results  on  asymptotic  inference  of  GMLE  for  C2  model  to  two 
breast  cancer  research  projects.  The  first  project  is  concerned  with  a  chemoprevention 
intervention  trial  of  indole-3-carbinol  (I3C)  for  breast  cancer  which  is  being  conducted  at 
Strang  Cancer  Prevention  Center.  The  statistical  question  of  interest  is  the  estimation  of 
duration of sustaining effect of I3C, which is C2 censored. A preliminary report on a short-term
trial has recently been published [18]; however, a longer trial lasting for more than one
year  is  still  ongoing  so  that  more  informative  data  on  duration  of  sustaining  effect  can  be 
obtained. 

The  second  project  is  a  breast  cancer  relapse  follow-up  study  based  on  data  obtained 
from 374 women with stages I - III unilateral invasive breast cancer surgically treated at
Memorial  Sloan-Kettering  Cancer  Center  between  1985  and  1990.  The  median  follow-up 
duration  was  46  months.  Relapse  time  was  given  by  the  time  interval  between  surgery  and 
the  initial  relapse.  A  relapse  that  took  place  between  two  successive  follow-up  visits  was 
regarded as interval censored. If a patient had not relapsed by the end of the study,
then  her  relapse  time  was  right  censored.  Of  the  374  observations,  300  were  right  censored 
(no  relapse),  21  were  left  censored  and  53  were  strictly  interval  censored  (74  relapses).  Bone 
marrow  micrometastasis  (BMM)  was  determined  for  each  woman  at  the  time  of  surgery. 
An  important  question  is  whether  remission  duration  is  related  to  the  extent  of  initial 
tumor  burden  defined  as  number  of  BMM  cells  detected.  Figure  1  compares  the  relapse-free 
GMLE  curves  of  patients  with  number  of  BMM  <  14  versus  those  with  number  of  BMM 
>  14.  Our  asymptotic  two-sample  distance  test  yielded  a  P  value  close  to  0.1.  An  abstract 
on  a  detailed  prognostic  analysis  of  the  entire  data  set  using  our  asymptotic  results  on  C2 
data  was  presented  at  the  annual  San  Antonio  Breast  Cancer  Symposium  in  December  1998. 

D.  KEY  RESEARCH  ACCOMPLISHMENTS 

•  presented  a  simple  nonparametric  estimator  of  the  survival  function  called  RTCE,  which 
has  an  explicit  expression  and  which  is  equal  to  GMLE  under  some  restrictions  on  the 
interval-censored  data 

•  established  consistency,  asymptotic  normality  and  asymptotic  efficiency  of  GMLE  under 
a  variety  of  interval  censorship  models 

•  presented  an  asymptotic  two-sample  nonparametric  test  for  different  interval  censorship 
models 

•  established  consistency,  asymptotic  normality  and  asymptotic  efficiency  for  the  MLE 
of  the  regression  coefficients  and  GMLE  of  the  survival  function  at  a  given  covariate 
pattern of a Cox regression model, assuming that both the survival time and the
censoring vector take on finitely many values

•  identified  the  computational  difficulties  associated  with  the  Newton-Raphson  algorithm 
for  computing  the  asymptotic  estimates  of  Cox  parameters 

•  pointed  out  future  directions  for  a  more  feasible  asymptotic  Cox  regression  analysis  of 
interval-censored  data 

•  made  available  to  the  public  a  set  of  computer  programs  for  calculating  RTCE  and 
GMLE,  carrying  out  asymptotic  inference  of  GMLE  and  for  evaluating  the  Z-score  of 
the  proposed  two-sample  nonparametric  test 

•  applied  the  established  asymptotic  generalized  maximum  likelihood  results  successfully 
to  a  breast  cancer  relapse  follow-up  study  with  374  women 

E. REPORTABLE OUTCOMES

•  10  published  articles: 

[a]  Li,  L.,  Watkins,  T.  and  Yu,  Q.  (1997).  An  EM  algorithm  for  smoothing  the  self- 
consistent  estimator  of  survival  functions  with  interval-censored  data.  Scandinavian 
Journal  of  Statistics.  24,  531-542. 

[b]  Yu,  Q.,  Li,  L.  and  Wong,  G.  Y.  C.  (1999).  On  consistency  of  the  self-consistent  estimator 
of  survival  functions  with  interval  censored  data.  Scandinavian  Journal  of  Statistics. 
(In  press). 

[c]  Yu,  Q.,  Schick,  A.,  Li,  L.  and  Wong,  G.  Y.  C.  (1998).  Asymptotic  properties  of  the 
GMLE  in  the  case  1  interval-censorship  model  with  discrete  inspection  times.  Canadian 
Journal  of  Statistics.  Vol.  4. 

[d]  Yu,  Q.,  Li,  L.  and  Wong,  G.  Y.  C.  (1998).  Asymptotic  variance  of  the  GMLE  of  a 
survival  function  with  interval-censored  data.  Sankhya,  A.  60,  184-197. 

[e]  Yu,  Q.,  Schick,  A.,  Li,  L.  and  Wong,  G.  Y.  C.  (1998).  Asymptotic  properties  of  the 
GMLE  of  a  survival  function  with  case  2  interval-censored  data.  Statistics  &  Probability 
Letters  37,  223-228. 

[f]  Yu,  Q.  and  Wong,  G.  Y.  C.  (1998).  Consistency  of  self-consistent  estimators  of  a  discrete 
distribution function with bivariate right-censored data. Communications in Statistics.
27,  1461-1476. 

[g] Wong, G. Y. C. and Yu, Q. (1999). Generalized MLE of a joint distribution function
with  multivariate  interval-censored  data.  Journal  of  Multivariate  Analysis  69,  155-166. 

[h] Schick, A. and Yu, Q. (1999). Consistency of the GMLE with mixed case interval-censored
data. Scandinavian Journal of Statistics. (In press).

[i] Li, L. and Yu, Q. (1997). Self-consistent estimators of survival functions with doubly-censored
data. Communications in Statistics, 2609-2623.

[j]  Wong,  G.  Y.  C.,  Bradlow,  H.  L.,  Sepkovic,  D.,  Mehl,  S.,  Mailman,  J.  and  Osborne, 
M.  P.  (1997).  A  dose-ranging  study  of  indole-3-carbinol  for  breast  cancer  prevention. 
Journal  of  Cellular  Biochemistry  Supplements  28/29,  111-116. 

Copies  of  the  articles  are  included  in  APPENDICES. 

•  2  submitted  manuscripts: 

[a]  Yu,  Q.,  Li,  L.  and  Wong,  G.  Y.  C.  Asymptotic  properties  of  NPMLE  with  mixed 
interval-censored data. (Submitted to the Annals of the Institute of Statistical Mathematics)

[b]  Yu,  Q.  and  Wong,  G.  Y.  C.  A  modified  GMLE  with  doubly-censored  data.  (Submitted 
to  Australian  Journal  of  Statistics). 

•  7  abstract  presentations: 

[a] Q. Yu, G. Y. C. Wong and L. Ye. Estimation of a survival function with interval-censored
data, a simulation study on the redistribution-to-the-inside estimator. 1995 Joint statistical
meetings at Orlando, Florida, U.S., August 13-17, 1995.

[b]  Q.  Yu,  L.  Li  and  G.Y.C.  Wong  (1996)  Variance  of  the  MLE  of  a  survival  function  with 
interval-censored  data.  1996  Sydney  international  statistical  congress,  Australia.  July 
8-12,  1996. 

[c]  Q.  Yu,  L.  Li  and  G.Y.C.  Wong  (1996)  Variance  of  the  MLE  of  a  survival  function  with 
doubly-censored data. 1996 Joint statistical meetings at Chicago, Illinois, U.S., August
4-8,  1996. 

[d] Q. Yu and L. Li. Asymptotic properties of self-consistent estimators with doubly-censored
data. 1997 Joint statistical meetings at Anaheim, California, U.S., August
10-14.

[e] Yu, Q. and G.Y.C. Wong. Asymptotic Properties Of Self-Consistent Estimators of A
Survival Function. ICSA 1997 Applied Statistics Symposium at Rutgers University, New
Jersey, U.S., May 30 - June 1, 1997.

Copies  of  the  abstracts  are  included  in  APPENDICES. 

•  computer  programs  for  asymptotic  inferences  of  GMLE  at  the  internet  site 
QYU@math.binghamton.edu 

• a proposal entitled “Statistical analysis of multivariate interval-censored data in breast
cancer follow-up studies” based on work supported by this award has been funded by
USAMRMC from 7/1/99 to 6/30/02 with George Y. C. Wong as principal investigator.

• a proposal entitled “Cox regression model for interval-censored data in breast cancer
follow-up studies” based on work supported by this award was submitted to
USAMRMC on June 15, 1999 with George Y.C. Wong as the principal investigator
and Qiqing Yu as co-investigator.

F.  CONCLUSIONS 

In  the  four  years  of  our  DOD  grant,  we  have  successfully  accomplished  our  research 
objectives  on  the  asymptotic  inference  of  the  GMLE  of  the  survival  function  for  interval- 
censored  data.  Under  different  interval  censorship  models,  we  have  established  consistency, 
asymptotic  normality  and  asymptotic  efficiency  of  the  GMLE.  When  both  the  survival  time 
and  the  censoring  vector  take  on  finitely  many  values,  we  have  established  similar  asymptotic 
properties  for  the  maximum  likelihood  estimators  of  the  regression  coefficients  and  the 
GMLE  of  the  survival  function  at  a  given  covariate  pattern  of  the  Cox  regression  model  for 
interval-censored  data.  We  have  made  available  to  the  public  a  set  of  computer  programs 
for  carrying  out  the  asymptotic  generalized  maximum  likelihood  inference  procedures  for  all 
types  of  interval-censored  data.  The  results  from  our  research  will  provide  clinicians  and  basic 
science  researchers  in  breast  cancer  with  a  set  of  fundamentally  important  statistical  tools  for 
the  analysis  of  interval-censored  data  that  are  encountered  in  breast  cancer  chemoprevention 
studies,  and  relapse  follow-up  studies  in  which  the  time-to-event  variable  cannot  be  exactly 
observed. 

Our research also indicates that asymptotic inferences for the parameters of the Cox regression
model for interval-censored data cannot be feasibly obtained by a standard iterative
algorithm, such as the Newton-Raphson algorithm. Our investigations into Cox regression
in this grant have inspired us to consider a computationally simpler two-step estimation
procedure for the parameters of the Cox model. We have consolidated our ideas into a proposal
entitled “Cox regression model for interval-censored data in breast cancer follow-up studies”,
which has been submitted to the USAMRMC for consideration for funding.

ABSTRACT. Interval-censored data arise in a wide variety of application and research areas
such  as,  for  example,  AIDS  studies  (Kim  et  al.,  1993)  and  cancer  research  (Finkelstein,  1986; 
Becker  &  Melbye,  1991).  Peto  (1973)  proposed  a  Newton-Raphson  algorithm  for  obtaining  a 
generalized  maximum  likelihood  estimate  (GMLE)  of  the  survival  function  with  interval- 
censored  observations.  Turnbull  (1976)  proposed  a  self-consistent  algorithm  for  interval- 
censored  data  and  obtained  the  same  GMLE.  Groeneboom  &  Wellner  (1992)  used  the  convex 
minorant  algorithm  for  constructing  an  estimator  of  the  survival  function  with  “case  2” 
interval-censored  data.  However,  as  is  known,  the  GMLE  is  not  uniquely  defined  on  the 
interval [0, ∞). In addition, Turnbull’s algorithm leads to a self-consistent equation which is
not  in  the  form  of  an  integral  equation.  Large  sample  properties  of  the  GMLE  have  not  been 
previously  examined  because  of,  we  believe,  among  other  things,  the  lack  of  such  an  integral 
equation. In this paper, we present an EM algorithm for constructing a GMLE on [0, ∞).
The  GMLE  is  expressed  as  a  solution  of  an  integral  equation.  More  recently,  with  the  help  of 
this integral equation, Yu et al. (1997a, b) have shown that the GMLE is consistent and
asymptotically  normally  distributed.  An  application  of  the  proposed  GMLE  is  presented. 

Key  words:  generalized  maximum  likelihood  estimator,  EM  algorithm,  interval  censorship, 
self-consistency 


1.  Introduction 

Interval-censored  data  are  frequently  seen  in  medical  studies,  pharmaceutical  applications, 
and engineering research. Let X_1, X_2, ..., X_n denote a random sample of observations of a
random variable X, called the failure time, with distribution function F, and let
(Y_1, Z_1), (Y_2, Z_2), ..., (Y_n, Z_n) denote a random sample of observations of a random vector
(L, R), called the censoring vector, with joint distribution function G(l, r), where with
probability one, L ≤ R. As is common, define S(x) = 1 - F(x) as the survival function of F. For
each observation X_i, there is a corresponding censoring vector (Y_i, Z_i). The failure time X_i is
observed if it is outside the open interval (Y_i, Z_i). When X_i is within (Y_i, Z_i), we only observe
(Y_i, Z_i) but not the value X_i, i.e. X_i is censored. When Z_i (Y_i) equals ∞ (0), the failure time X_i is
subject to a right (left) censorship. If only min{max{X_i, Y_i}, Z_i} is observed, we say the failure
time  is  subject  to  a  double  censorship.  It  is  readily  seen  that  the  interval  censoring  scheme 
contains  right  censoring  and  left  censoring  schemes  as  special  cases.  If  the  functional  form  of 
the  distribution  function  F  is  known,  we  only  need  to  estimate  the  parameters  of  F.  However, 
when  the  functional  form  of  F  is  unknown,  a  non-parametric  approach  must  be  used.  This  paper 
focuses  on  the  latter. 
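
The observation scheme just described can be written down directly. The sketch below is for illustration only and is not taken from the paper: `observe` returns what is seen under interval censoring for one failure time x and censoring vector (y, z), and `doubly_censored_value` returns min{max{x, y}, z}, the quantity observed under double censorship. Both function names are assumptions of the example.

```python
def observe(x, y, z):
    """Interval censoring: x is seen exactly unless it falls inside the open interval (y, z)."""
    if y < x < z:
        return (y, z)  # censored: only the interval is observed
    return (x, x)      # exact observation, recorded as a degenerate interval

def doubly_censored_value(x, y, z):
    """Double censoring: the observed value is min(max(x, y), z)."""
    return min(max(x, y), z)
```

Taking z equal to infinity with y finite, or y equal to 0 with z finite, gives the right and left censoring special cases mentioned in the text.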

Kaplan  &  Meier  (1958)  proposed  the  product  limit  estimator  (PLE)  to  estimate  the  survival 
function  when  data  are  right-censored.  There  have  been  extensive  studies  concerning  the  PLE. 
Doubly-censored data, which treat right-censored and left-censored data as special cases, are
investigated by many authors. A self-consistent estimator (Efron, 1967) of the survival function
with doubly-censored data as well as various properties of the estimator such as strong
convergence, asymptotic normality, etc., are established (see, for example, Turnbull, 1974;
Chang, 1990; Gu & Zhang, 1993). The self-consistent estimator is implicitly expressed as a
solution of an integral equation. No closed forms of the estimator have been presented. For
arbitrarily interval-censored data, Peto (1973) proposes a Newton-Raphson algorithm to obtain
a generalized maximum likelihood estimator (GMLE) (see Kiefer & Wolfowitz, 1956; Johansen,
1978) of the survival function. Turnbull (1976) derives a self-consistent algorithm and shows
that the algorithm converges monotonically to the GMLE. This GMLE is, however, not uniquely
determined in innermost intervals (see definition below). Furthermore, Turnbull's self-consistent
equation is not in the form of an integral equation. Studies about arbitrarily interval-censored
data are not as fruitful as those mentioned above due to, among other things, lack of an integral
equation for the GMLE. Tsai & Crowley (1985) discuss connections among the GMLE, the
algorithm, and the self-consistent estimators for incomplete data, focusing on right censoring
and double censoring cases, taking advantage of availability of the integral equations for the
latter two models. Groeneboom & Wellner (1992) use the convex minorant algorithm for
computing the MLE of the survival function with “case 2” interval-censored data. The case 2
interval censoring is the same as arbitrary interval censoring described above except that the
exact  observations  can  never  be  observed  (thus  it  is  a  special  case  of  the  arbitrary  interval 
censoring).  Yu  &  Wong  (1996a,  b)  consider  a  special  case  of  interval-censored  data.  They 
assume  that  any  two  censoring  intervals  are  either  disjoint  or  one  includes  another_  “ 
assumption  covers  a  wide  variety  of  situations.  They  derive  an  explicit  expression  for  the  GMLE 

of  the  survival  function  and  then  prove  that  the  estimator  is  strongly  consistent. 

Since Turnbull's self-consistent GMLE is not uniquely defined on innermost intervals, it is
not convenient to use the estimator if the data are heavily censored. In this paper, we propose an
EM approach to construct a GMLE that is defined on the interval $[0, \infty)$. This approach also
gives an integral equation expression for the GMLE. More recently, with the help of this integral
equation, Yu et al. (1997a, b) prove the uniform strong consistency and asymptotic normality of
the GMLE.

The organization of the paper is as follows. Section 2 provides the necessary definitions and
background. In section 3, we prove the convergence of the proposed EM algorithm and show
that it converges to the same GMLE as Turnbull's. An application of the estimator derived is
presented in section 4.


2.  Algorithms 

Following the notation of section 1, assume that the vectors $(Y_i, Z_i, X_i)$, $i = 1, \ldots, n$, are
mutually independent, and that $X_i$ and $(Y_i, Z_i)$ are also independent. If $Y_i < X_i < Z_i$, the
censoring interval $(Y_i, Z_i)$ rather than the failure time $X_i$ is observed and we denote the
observation by an open interval $(L_i, R_i) = (Y_i, Z_i)$; if $X_i$ is outside $(Y_i, Z_i)$ we observe the exact
failure time. In the latter case, we define $L_i = X_i = R_i$, and call $X_i$ or the closed interval $[L_i, R_i]$
an exact observation. Thus we may assume that the final observations are

\[ \{L_i, R_i\} = \begin{cases} [L_i, R_i] & \text{if } L_i = R_i, \\ (L_i, R_i) & \text{if } L_i < R_i \end{cases} \]

(note: some of the intervals are collapsed to points), and, without loss of generality (WLOG),
assume that $L_1 \le L_2 \le \cdots \le L_n$. Let $\mathcal{L}$ denote the set $\{l_i;\ 1 \le i \le n\}$ and $\mathcal{R}$ the set
$\{r_i;\ 1 \le i \le n\}$, where $\{l_i, r_i\}$ are realizations of $\{L_i, R_i\}$. Ranking the $2n$ points ($n$ $l$'s and $n$
$r$'s) in increasing order yields a sequence, say $c_1 \le c_2 \le \cdots \le c_{2n}$. If there exist ties in the
observations, we suppose that

1. $R_i$ has smaller rank than $L_j$ if $R_i = L_j < R_j$;

2. $L_i$ has larger rank than $R_j$ if $L_i = R_j > L_j$;

3. If $\{L_i, R_i\} = \{L_j, R_j\}$ and $i < j$ then they are ranked as $L_i \le L_j \le R_i \le R_j$.

Define an innermost interval $\{p, q\}$ to be the non-empty intersection of observed intervals
$\{L_i, R_i\}$ such that $\{p, q\} \cap \{L_i, R_i\}$ is either an empty set or $\{p, q\}$. Notice that every exact
observation comprises a closed innermost interval and that distinct innermost intervals are
disjoint. Suppose that there are $m$ ($\le n$) such distinct (open or closed) innermost intervals:
$\{p_1, q_1\}, \ldots, \{p_m, q_m\}$, where $p_1 \le q_1 \le \cdots \le p_m \le q_m$ and

\[ \{p_i, q_i\} = \begin{cases} (p_i, q_i) & \text{if } p_i < q_i, \\ [p_i, q_i] & \text{if } p_i = q_i. \end{cases} \]

Turnbull (1976) provides a self-consistent algorithm for obtaining the GMLE of $S$, and shows
that the GMLE assigns weight on innermost intervals only. Specifically, define an indicator
function $\delta_{ij} = 1$ if $\{p_j, q_j\} \subset \{l_i, r_i\}$, and 0 otherwise. Let

\[ \mu_{ij}(s) = \frac{\delta_{ij} s_j}{\sum_{k=1}^{m} \delta_{ik} s_k}, \]

where $s = (s_1, \ldots, s_m)$ are the masses assigned to the corresponding innermost intervals,
satisfying $\sum_{i=1}^{m} s_i = 1$, $s_i \ge 0$, $1 \le i \le m$. Write

\[ \pi_j(s) = \frac{1}{n} \sum_{i=1}^{n} \mu_{ij}(s). \]

The GMLE, and hence the self-consistent estimator of $S$, can be obtained by the following
iterative procedure.

1. Set the initial values $s_j^0 = 1/m$, $1 \le j \le m$.

2. Compute $\mu_{ij}(s^0)$, and set $s_j^1 = \pi_j(s^0)$.

3. Repeat step 2 by replacing $s^0$ with $s^1$, and so on.
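The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the indicator matrix $\delta_{ij}$ has already been computed, and the function name `turnbull_weights`, the tolerance `tol` and the iteration cap `max_iter` are hypothetical choices made here.

```python
import numpy as np

def turnbull_weights(delta, tol=1e-10, max_iter=10_000):
    """Self-consistency iteration for the masses s_1, ..., s_m.

    delta is an (n, m) 0/1 array with delta[i, j] = 1 when innermost
    interval j lies inside observed interval i (assumed precomputed).
    """
    delta = np.asarray(delta, dtype=float)
    n, m = delta.shape
    s = np.full(m, 1.0 / m)                 # step 1: s_j^0 = 1/m
    for _ in range(max_iter):
        denom = delta @ s                   # sum_k delta_ik s_k, one value per observation
        mu = delta * s / denom[:, None]     # mu_ij(s)
        s_new = mu.mean(axis=0)             # pi_j(s) = (1/n) sum_i mu_ij(s)
        if np.max(np.abs(s_new - s)) < tol: # stop once the update no longer moves
            return s_new
        s = s_new                           # step 3: repeat with the new masses
    return s
```

Each row of `mu` sums to one, so the masses keep summing to one at every iteration. The stopping rule on the largest change in `s` is one simple choice; monitoring the likelihood would be another.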

This procedure converges monotonically to the estimate of the weight $s$. Although the
GMLE of $S(x)$ can be formed as

\[ \hat{S}(x) = \sum_{j:\, q_j > x} s_j, \qquad x \notin \bigcup_j \{p_j, q_j\}, \tag{2.1} \]


we  only  know  the  amount  of  weight  on  innermost  intervals  but  not  the  way  that  the  weight  varies 
within  the  innermost  intervals.  We  now  present  an  EM  algorithm  for  obtaining  the  GMLE  of 
S(x).  The  proposed  GMLE  assigns  the  same  weight  on  innermost  intervals  as  Turnbull’s  and 
describes  the  distribution  of  the  weight  within  the  innermost  intervals.  Meanwhile,  an  expression 
for  the  GMLE  is  obtained. 

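Let $H_0(x)$ denote a strictly increasing initial distribution function on $[0, a)$, where
$a \ge \max\{r_i;\ r_i \in \mathcal{R}\}$ (for example, $H_0(x) = 1 - \exp(-x)$, $x \ge 0$; a choice of the initial $H_0$ is
given in section 4) and define

\[ H_1(x) = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{H_0(x) - H_0(l_i)}{H_0(r_i-) - H_0(l_i)}\, I(x \in (l_i, r_i)) + I(x \in [r_i, \infty)) \right], \]

where $I(A)$ denotes the indicator function of the event $A$ and $f(x-) = \lim_{t \uparrow x} f(t)$ (for an exact
observation, $l_i = r_i$, the $i$th summand is read as $I(x \ge l_i)$). Then define a
distribution function

\[ H_2^i(x) = \frac{H_1(x) - H_1(l_i)}{H_1(r_i-) - H_1(l_i)}\, I(x \in (l_i, r_i)) + I(x \in [r_i, \infty)). \]

In other words, we truncate distribution $H_1$ on each censored interval. Let

\[ H_2(x) = \frac{1}{n} \sum_{i=1}^{n} H_2^i(x). \]

Then use $H_2$ as an initial distribution function and repeat the above procedure to obtain $H_3$.
More specifically, on the $k$th iterative procedure, $H_k$ is calculated by

\[ H_k(x) = \frac{1}{n} \sum_{i=1}^{n} H_k^i(x), \]

where, when $l_i = r_i$, $H_k^i(x) = 0$ if $x < l_i$ and $1$ if $x \ge l_i$, and, when $l_i < r_i$,

\[ H_k^i(x) = \frac{H_{k-1}(x) - H_{k-1}(l_i)}{H_{k-1}(r_i-) - H_{k-1}(l_i)}\, I(x \in (l_i, r_i)) + I(x \in [r_i, \infty)). \]

In terms of conditional expectation,

\[ H_k(x) = E_{H_{k-1}}\left[ \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x) \,\Big|\, X_i \in \{l_i, r_i\},\ i = 1, \ldots, n \right]. \]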

This is an EM algorithm (Tsai & Crowley, 1985). A proof of the convergence of the EM
algorithm is given in the next section. Thus, the limiting distribution, say $H$, is a self-consistent
estimator of $F$. It is known that Turnbull's algorithm is also an EM algorithm. The difference between
these two EM algorithms was described in the paragraph following the definition of
Turnbull's algorithm. In addition, it is easy to see that in terms of convergence rate one is not
superior to the other.
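One way to see the update from $H_{k-1}$ to $H_k$ at work is to evaluate it on a grid. The sketch below is an illustration under stated assumptions, not the authors' code: the function name `em_step` and the grid representation are hypothetical, and $H_{k-1}(r_i-)$ is read off as $H_{k-1}(r_i)$, which is only valid when $H_{k-1}$ is continuous at every censored right endpoint (for instance, when no exact observation coincides with such an endpoint).

```python
import numpy as np

def em_step(grid, H_prev, intervals):
    """One update H_{k-1} -> H_k, evaluated on `grid`.

    grid      : increasing 1-D array containing every finite endpoint
    H_prev    : values of H_{k-1} on grid
    intervals : list of (l, r); l == r marks an exact observation,
                r may be np.inf for a right-censored one
    Assumption: H_{k-1} is continuous at each censored right endpoint,
    so H_{k-1}(r-) is taken to be H_{k-1}(r).
    """
    grid = np.asarray(grid, dtype=float)
    H_prev = np.asarray(H_prev, dtype=float)
    H_new = np.zeros_like(H_prev)
    for l, r in intervals:
        if l == r:
            # exact observation: unit step at l
            H_new += (grid >= l)
            continue
        H_l = np.interp(l, grid, H_prev)
        H_r = 1.0 if np.isinf(r) else np.interp(r, grid, H_prev)
        inside = (grid > l) & (grid < r)
        # H_{k-1} truncated to (l, r), plus full mass at and beyond r
        H_new += np.where(inside, (H_prev - H_l) / (H_r - H_l), 0.0) + (grid >= r)
    return H_new / len(intervals)
```

Calling `em_step` repeatedly, starting from a strictly increasing `H_prev` such as `1 - np.exp(-grid)`, gives a grid approximation of the sequence $H_1, H_2, \ldots$; the grid resolution limits how well the shape inside an interval is resolved.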


3.  Main  results 

We  first  make  sure  that  the  proposed  EM  algorithm  is  well-defined,  namely  we  need  to  guarantee 
that  the  denominators  involved  in  Hk  are  not  zero.  This  is  assured  by  the  following  lemma.  The 
proof  of  it  is  simply  by  induction  on  k  and  is  omitted. 

Lemma 1

For $k \ge 1$, $H_k(r_i-) - H_k(l_i) \ge 1/n$.

We now prove that the EM algorithm converges. The proof is similar to th. 2.1 of Tsai &
Crowley (1985).

Theorem 1

As $k \to \infty$, $H_k(x)$ converges to, say, $H(x)$.

Proof. By the definition of the EM algorithm, the initial estimator $H_1$ has its weight on
observations $\{L_i, R_i\}$, $i = 1, \ldots, n$, only. This implies that it is the EM algorithm for incomplete
multinomial data, which belong to the exponential family. Thus by th. 2 of Wu (1983) the EM
algorithm converges.


We now consider a transformation of the observed censoring intervals. The transformed data
make the proofs simpler and produce the same self-consistent estimate as do the original data.
Let $\{l_i, r_i\}$, $1 \le i \le n$, be the original data, and let $\{p_i, q_i\}$, $1 \le i \le m$, be the innermost
intervals. For convenience, define $p_{m+1} = \infty$ and $q_0 = 0$. The transformation proceeds as
follows. For any $i$, (1) if there is not any exact observation at $q_i$, then move all $r$'s between $q_i$ and
$p_{i+1}$ to $q_i$; (2) otherwise, move all $r$'s between $q_i$ and $p_{i+1}$ to the smallest $r$ that is greater than $q_i$;
similarly, if there is not any exact observation at $p_i$, then move all $l$'s between $p_i$ and $q_{i-1}$ to $p_i$;
otherwise, move all $l$'s between $p_i$ and $q_{i-1}$ to the largest $l$ that is smaller than $p_i$. We call the
transformation S-transformation. We use $\{L'_i, R'_i\}$ to denote the S-transformed data. To illustrate
the transformation, consider the following example.

Example 1. If the original data are

(_1  (_2  (_3  x_4  )_2  (_5  )_5  )_1  x_6  (_7  )_3  x_8  )_7

where (_j and )_j denote the endpoints of the $j$th censoring interval, then the S-transformed data are

(_{1,2,3}  x_4  )_2  (_5  )_{5,1}  x_6  (_7  )_3  x_8  )_7

or $l'_1 = l'_2 = l'_3 = l_3 < x_4 < r'_2 = r_2 \le l'_5 = l_5 < r'_5 = r'_1 = r_5 < x_6 < l'_7 = l_7 < r'_3 = r_3 < x_8 < r'_7 = r_7$
(we pretend that $l'_1 \le l'_2 \le l'_3$ and $r'_5 \le r'_1$).

It is important to note that the S-transformation does not change the innermost intervals and
$(L, R)$ contains an innermost interval if and only if $(L', R')$ contains the same innermost interval.
Noting that the likelihood function can be written as

\[ \prod_{i=1}^{n} \left( \sum_{k=1}^{m} \delta_{ik} s_k \right) \quad \text{(see Peto, 1973)}, \]

we  see  that  the  S-transformation  does  not  change  the  likelihood  function.  Since  the  GMLE  of  s 
is  uniquely  determined  by,  and  has  weight  only  on,  innermost  intervals  (Peto,  1973),  the  original 
data  and  the  corresponding  S-transformed  data  produce  the  same  GMLE  of  s  (Yu  &  Wong, 
1996a).  Hence,  from  now  on,  we  use  the  following  convention. 

Convention 

We  suppress  the  word  S-transformation  and  assume  that  the  data  are  already  S-transformed 
unless  otherwise  specified. 

Notice  that  the  GMLE  of  S  is  entirely  determined  by  s  (see  (2.1)),  and  thus  the  original  data 
and  the  transformed  data  give  the  same  GMLE  of  S. 

Theorem  2 

Suppose that the initial distribution $H_0$ is strictly increasing on $[0, a)$ where $a$ is defined in
section 2. Let $\mathcal{G} = \bigcup_{i=1}^{m} \{p_i, q_i\}$ and let $\mathcal{A}$ denote the support of the limiting distribution
function $H = \lim_{k \to \infty} H_k$. Then $\mathcal{A} \subset \mathcal{G}$.

Proof. It is sufficient to prove that non-innermost intervals do not have weight. If $(c_m, c_{m+1})$
is a non-innermost interval, then it must be one of the following cases:

(a) $(l_j, x_{j+1})$, where $l_j < x_{j+1} < r_j$ (remember the convention, i.e. there is no additional $l_i$ or
$r_i$ within $(l_j, x_{j+1}]$);

(b) $(x_j, l_{j+1})$;

(c) $(x_j, r_i)$, where $i < j$ and $l_i < x_j < r_i$;

(d) $(r_i, x_j)$;

(e) $(r_p, l_j)$, where $l_p < r_p < l_j < r_j$.


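We now prove that none of the above non-innermost intervals has weight. First consider
(a). Note that

\[ H(x_{j+1}-) - H(l_j) = \frac{1}{n} \sum_{i=1}^{n} \frac{H(x_{j+1}-) - H(l_j)}{H(r_i-) - H(l_i)}\, I(l_j \in [l_i, r_i)) = [H(x_{j+1}-) - H(l_j)]\, \frac{1}{n} \sum_{i=1}^{n} \frac{I(l_j \in [l_i, r_i))}{H(r_i-) - H(l_i)}. \]

Thus, either $H(x_{j+1}-) - H(l_j) = 0$, or

\[ H(x_{j+1}-) - H(l_j) \ne 0 \quad \text{and} \quad \frac{1}{n} \sum_{i=1}^{n} \frac{I(l_j \in [l_i, r_i))}{H(r_i-) - H(l_i)} = 1. \tag{3.1} \]

In addition, if (3.1) is true,

\[ H(x_{j+1}) - H(l_j) = [H(x_{j+1}) - H(l_j)]\, \frac{1}{n} \sum_{i=1}^{n} \frac{I(l_j \in [l_i, r_i))}{H(r_i-) - H(l_i)} + \frac{1}{n} \sum_{i=1}^{n} \left[ I(x_{j+1} \in [r_i, \infty)) - I(l_j \in [r_i, \infty)) \right] = [H(x_{j+1}) - H(l_j)] + \frac{1}{n}, \]

which is impossible. Thus, $H(x_{j+1}-) - H(l_j) = 0$.

Now consider (b) $(c_m, c_{m+1}) = (x_j, l_{j+1})$. Note

\[ H(l_{j+1}-) - H(x_j) = \frac{1}{n} \sum_{i=1}^{n} \frac{H(l_{j+1}-) - H(x_j)}{H(r_i-) - H(l_i)}\, I(l_{j+1}- \in (l_i, r_i)) = [H(l_{j+1}-) - H(x_j)]\, \frac{1}{n} \sum_{i=1}^{n} \frac{I(l_{j+1}- \in (l_i, r_i))}{H(r_i-) - H(l_i)}. \]

Thus, either $H(l_{j+1}-) - H(x_j) = 0$, or

\[ H(l_{j+1}-) - H(x_j) \ne 0 \quad \text{and} \quad \frac{1}{n} \sum_{i=1}^{n} \frac{I(l_{j+1}- \in (l_i, r_i))}{H(r_i-) - H(l_i)} = 1. \tag{3.2} \]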

Furthermore,  it  is  readily  seen  that,  if  (3.2)  is  true, 

The  proofs  of  (c)  and  (d)  are  similar  to  that  of  (a)  and  (b). 
Finally consider (e) $(c_m, c_{m+1}) = (r_p, l_j)$, where $l_p < r_p < l_j < r_j$.
Notice that


Therefore,

\[ H(l_j-) - H(r_p) = \frac{1}{n} \sum_{i=1}^{n} \frac{H(l_j-) - H(r_p)}{H(r_i-) - H(l_i)}\, I(l_j \in (l_i, r_i)) = [H(l_j-) - H(r_p)]\, \frac{1}{n} \sum_{i=1}^{n} \frac{I(l_j \in (l_i, r_i))}{H(r_i-) - H(l_i)}. \]

Thus, either $H(l_j-) - H(r_p) = 0$, or

\[ H(l_j-) - H(r_p) \ne 0 \quad \text{and} \quad \frac{1}{n} \sum_{i=1}^{n} \frac{I(l_j \in (l_i, r_i))}{H(r_i-) - H(l_i)} = 1. \tag{3.3} \]

We now show that case (3.3) leads to a contradiction. Suppose that (3.3) is true. WLOG, we can
assume that there is no tie at $r_p$ and at $l_j$. Then exactly one of the following cases must be true:

(e.1) The point right before $r_p$, say $c_{m-1}$, is either $l_p$ or an exact observation, say $x_{p_1}$;

(e.2) $c_{m-1}$ is a left endpoint, say $l_{p_2}$, $p_2 \ne p$.

If case (e.1) is true, it is easy to reach a contradiction using an argument similar to that of
case (a) by considering $H(l_j-) - H(r_p)$ and $H(r_p) - H(c_{m-1})$.

Now suppose that case (e.2) is true. By the definition of $H_0$, it is easy to see that, for $k \ge 0$,
$dH_k$ assigns positive weight to the intervals $(l_{p_2}, r_p)$ and $(r_p, l_j)$. We now prove that the ratio

\[ \frac{H_k(l_j) - H_k(r_p)}{H_k(r_p) - H_k(l_{p_2})} \]

is non-increasing in $k$. In fact,

\[ \frac{H_{k+1}(l_j) - H_{k+1}(r_p)}{H_{k+1}(r_p) - H_{k+1}(l_{p_2})} = \frac{\dfrac{1}{n} \sum_{i=1}^{n} \dfrac{[H_k(l_j) - H_k(r_p)]\, I((r_p, l_j) \subset (l_i, r_i))}{H_k(r_i-) - H_k(l_i)}}{\dfrac{1}{n} \sum_{i=1}^{n} \dfrac{[H_k(r_p) - H_k(l_{p_2})]\, I(r_p \in (l_i, r_i])}{H_k(r_i-) - H_k(l_i)}} \le \frac{H_k(l_j) - H_k(r_p)}{H_k(r_p) - H_k(l_{p_2})}, \]

because $I((r_p, l_j) \subset (l_i, r_i)) = 0$ if $r_p \in (l_i, r_i]$ and $l_j \notin (l_i, r_i)$, in particular if $i = p$,
and thus

\[ \frac{H_k(l_j) - H_k(r_p)}{H_k(r_p) - H_k(l_{p_2})} \le \frac{H_0(l_j) - H_0(r_p)}{H_0(r_p) - H_0(l_{p_2})}. \]

Taking limits as $k \to \infty$ yields

\[ \frac{H(l_j) - H(r_p)}{H(r_p) - H(l_{p_2})} \le \frac{H_0(l_j) - H_0(r_p)}{H_0(r_p) - H_0(l_{p_2})}. \]

However, if case (e.2) is true, that is, if $H(l_j) - H(r_p) > 0$ and $H(r_p) - H(l_{p_2}) = 0$, then

\[ +\infty = \frac{H(l_j) - H(r_p)}{H(r_p) - H(l_{p_2})} \le \frac{H_0(l_j) - H_0(r_p)}{H_0(r_p) - H_0(l_{p_2})} < +\infty. \]

The contradiction implies that case (e.2) is impossible. Thus (3.3) is impossible. It follows that
$H(l_j) - H(r_p) = 0$. This completes the proof of theorem 2.


Remark 1. The result of theorem 2 does not depend on the choice of the initial distribution
$H_0$, provided that $H_0$ is chosen by its definition in section 2. Example 2 below indicates that the
strictly increasing restriction on $H_0$ is necessary. The example also shows that a self-consistent
estimator is not necessarily a GMLE.


Example 2. Suppose there are only two censoring intervals: (0, 1) and (0.5, 1.5). Let

\[ H_0(x) = x\, I(x \in (0, 0.5)) + \tfrac{1}{2} I(x \in [0.5, \infty)) + (x - 1)\, I(x \in [1, 1.5)) + \tfrac{1}{2} I(x \in [1.5, \infty)). \]

Then $H_k(x) = H_0(x)$ for $k \ge 1$, and thus $H(x) = \lim_{k \to \infty} H_k(x) = H_0(x)$. It is readily seen that
non-innermost intervals (0, 0.5) and (1, 1.5) each have weight 1/2, but the innermost interval (0.5,
1) does not have any weight.
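The fixed-point claim in Example 2 can be checked by hand. The following worked computation is a sketch added for illustration; it writes $H_1^i$ for the contribution of the $i$th censoring interval to the update of section 2, and it uses only the values of $H_0$ given in the example.

```latex
% Example 2: intervals (l_1, r_1) = (0, 1) and (l_2, r_2) = (0.5, 1.5), n = 2.
% H_0(1-) - H_0(0) = 1/2 and H_0(1.5-) - H_0(0.5) = 1/2, so
\begin{align*}
H_1^1(x) &= 2 H_0(x)\, I(x \in (0, 1)) + I(x \in [1, \infty)), \\
H_1^2(x) &= 2\,[H_0(x) - \tfrac12]\, I(x \in (0.5, 1.5)) + I(x \in [1.5, \infty)), \\
H_1(x) &= \tfrac12 \left[ H_1^1(x) + H_1^2(x) \right] =
\begin{cases}
\tfrac12 (2x + 0) = x, & 0 < x < 0.5, \\
\tfrac12 (1 + 0) = \tfrac12, & 0.5 \le x < 1, \\
\tfrac12 \bigl(1 + 2(x - 1)\bigr) = x - \tfrac12, & 1 \le x < 1.5, \\
\tfrac12 (1 + 1) = 1, & x \ge 1.5.
\end{cases}
\end{align*}
```

The right-hand side agrees with $H_0$ on every piece, so the iteration never leaves $H_0$ and the innermost interval (0.5, 1) never receives any mass.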

The  next  theorem  shows  that  for  each  innermost  interval  the  EM  and  Turnbull’s  algorithms 
assign  the  same  weight  on  it. 

Theorem  3 

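The limiting equation for the EM algorithm

\[ H(x) = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{H(x) - H(l_i)}{H(r_i-) - H(l_i)}\, I(x \in (l_i, r_i)) + I(x \in [r_i, \infty)) \right] \tag{3.4} \]

is equivalent to Turnbull's self-consistent equation

\[ s_j = \sum_{i=1}^{n} \mu_{ij}(s)/n. \tag{3.5} \]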

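Proof. Let $dH$ be the measure induced by $H$ and let $dS$ be the measure induced by the self-consistent
estimate. It follows from Theorem 2 that both $dH$ and $dS$ assign weight to the
innermost intervals only. Then $dH$ assigns weight $w_j = H(q_j-) - H(p_j)$ ($H(q_j) - H(p_j-)$) to
the $j$th open (closed) innermost interval and $dS$ assigns weight $s_j$ to the $j$th innermost interval. It
suffices to show that the $w_j$'s satisfy (3.5) and the $s_j$'s satisfy (3.4).

W.l.o.g., assume that $p_j = l_{j_1}$ and $q_j = r_{j_2}$, where $j_2 \le j_1$. Note that

(1) $w_j = H(r_{j_2}) - H(l_{j_1}-)$ if $p_j = q_j$;

(2) $w_j = H(r_{j_2}-) - H(l_{j_1})$ if $p_j < q_j$.

We first assume $p_j = q_j$, i.e., $r_{j_2} = l_{j_1}$. Then

\[ w_j = \frac{1}{n} \sum_{i=1}^{n} \frac{H(l_{j_1}) - H(l_{j_1}-)}{H(r_i-) - H(l_i)}\, I(l_{j_1} \in (l_i, r_i)) + \frac{1}{n} \sum_{i=1}^{n} I(l_i = r_i = l_{j_1}), \]

which, since $H(r_i-) - H(l_i) = \sum_{k=1}^{m} \delta_{ik} w_k$ for a censored interval by Theorem 2, is the same as

\[ w_j = \frac{1}{n} \sum_{i:\, l_i < r_i} \frac{\delta_{ij} w_j}{\sum_{k=1}^{m} \delta_{ik} w_k} + \frac{1}{n} \sum_{i:\, l_i = r_i} \delta_{ij}. \]

It follows that if $p_j = q_j$, then

\[ w_j = \frac{1}{n} \sum_{i=1}^{n} \frac{\delta_{ij} w_j}{\sum_{k=1}^{m} \delta_{ik} w_k}. \]

Similarly, we can show that if $p_j < q_j$, then

\[ w_j = \frac{1}{n} \sum_{i=1}^{n} \frac{\delta_{ij} w_j}{\sum_{k=1}^{m} \delta_{ik} w_k}. \]

This is the same as the equation for the $s_j$'s, i.e., (3.5).
Analogously we can show that the $s_j$'s satisfy (3.4).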


As mentioned before, the self-consistent GMLE of $F(x)$ is not uniquely defined for
$x \in (p_i, q_i)$ if $p_i < q_i$ (Peto, 1973). For the proposed EM algorithm, the value of the estimate
$H(x)$ for $x \in (p_i, q_i)$ is uniquely determined once $H_0$ is determined. It is readily seen that the
GMLE defined by (3.4) can be written as a solution of an integral equation as follows.

\[ H(x) = \iint \frac{H(x) - H(l)}{H(r-) - H(l)}\, I(l < x < r)\, dG^*(l, r) + \iint I(r \le x)\, dG^*(l, r), \tag{3.6} \]

where $G^*(l, r)$ is the distribution function of the observable random vector $(L, R)$. Note that
equation (3.6) needs to be modified if we define censoring intervals to be closed $[Y, Z]$ rather than
open $(Y, Z)$ as in this paper. Combining theorems 1, 2, and 3, we can prove the following theorem.


Theorem  4 

The limiting distribution $H(x)$ of the EM algorithm is the GMLE of $F$, and is independent of $H_0$
for $x \notin \mathcal{G}$, where $\mathcal{G}$ is the union of the innermost intervals.


Proof. By theorem 2, the sum of weights on the innermost intervals equals unity, and by
theorem 3, the limiting equations of the EM and Turnbull's algorithms are the same. Since
Turnbull's algorithm converges to the GMLE, provided that the support of the estimate is on the
union of innermost intervals (Turnbull, 1976), the EM algorithm converges to the GMLE, too.
In addition, since the weight of the GMLE on innermost intervals is uniquely determined by the
observations given (Peto, 1973), the value of $H(x)$, $x \notin \mathcal{G}$, does not depend on the choice
of $H_0$.


4.  Applications 

In  this  section,  we  shall  illustrate  the  smoothed  GMLE  technique  by  using  a  real  data  set.  It  is 
readily  seen  that  the  choice  of  the  initial  distribution  H0  does  not  affect  the  total  amount  of 
weight on innermost intervals but does affect the value of $S(x)$ when $x \in (p, q)$, an innermost
interval. We present an intuitive approach to choose $H_0$.

We use the midpoint method. For any $1 \le i \le n$, if $r_i = \infty$, we ignore the interval $[l_i, r_i]$, and
if $r_i < \infty$, we let $m_i$ denote the midpoint of $[l_i, r_i]$. Suppose there are $k$ such midpoints and,
WLOG, suppose that they are distinct with $m_1 < m_2 < \cdots < m_k$. The initial distribution
function $H_0$ is constructed as follows. Firstly construct an empirical cumulative distribution
function (EDF) based on the midpoints $\{m_i,\ 1 \le i \le k\}$. The EDF jumps at midpoints and is
constant between two consecutive midpoints. Secondly, for $1 \le i \le k - 1$, let $t_i$ be the centre
point of $[m_i, m_{i+1}]$ and connect points $(t_i, i/k)$ and $(t_{i+1}, (i + 1)/k)$ with a line segment. Finally
connect $(t_{k-1}, (k - 1)/k)$ and $(r^*, 1)$ as well as $(0, 0)$ and $(m_1, 1/k)$ with a line segment,
respectively, where $r^* = \max_{i \le n}\{r_i:\ r_i < \infty\}$. This constructed polygonal line is the initial
distribution $H_0$ which is continuous and strictly increasing on $[0, r^*]$.
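The midpoint construction can be sketched as a short function that returns the nodes of the polygonal line. This is a hedged illustration, not the authors' code: the names `midpoint_initial_cdf` and `H0` are hypothetical, tied midpoints are merged with `np.unique` (the text assumes distinct midpoints), and the first segment is assumed to run from $(0, 0)$ to $(t_1, 1/k)$ so that the line stays strictly increasing, which is one reading of the final step above.

```python
import numpy as np

def midpoint_initial_cdf(intervals):
    """Nodes (xs, ys) of a polygonal initial distribution H_0.

    intervals : list of (l, r); pairs with r = inf are ignored.
    Assumes at least two distinct finite midpoints and l >= 0.
    """
    finite = [(l, r) for l, r in intervals if np.isfinite(r)]
    mids = np.unique([(l + r) / 2.0 for l, r in finite])  # sorted, distinct midpoints
    k = len(mids)
    r_star = max(r for _, r in finite)                     # largest finite right endpoint
    t = (mids[:-1] + mids[1:]) / 2.0                       # centre points t_1, ..., t_{k-1}
    xs = np.concatenate(([0.0], t, [r_star]))
    ys = np.concatenate(([0.0], np.arange(1, k) / k, [1.0]))
    return xs, ys

def H0(x, xs, ys):
    """Evaluate the polygonal H_0 by linear interpolation between nodes."""
    return np.interp(x, xs, ys)
```

The nodes are strictly increasing in both coordinates because $t_{k-1} < m_k \le r^*$, so the interpolated function is continuous and strictly increasing on $[0, r^*]$, as the construction requires.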

We  now  use  a  data  set  to  demonstrate  the  proposed  EM  estimator. 


Example 3. The following data have been used by Finkelstein & Wolfe (1985) to compare
two different treatments for breast cancer patients. The censoring intervals (in months) arose in
the follow-up studies for patients treated with radiotherapy and chemotherapy or with radiotherapy
alone. The failure time is the time until cosmetic deterioration, as determined by the
appearance of breast retraction. The data are reproduced in Tables 1 and 2. The estimate of S for
each data set is obtained using the technique derived in this paper. The comparison of the
survival functions with the treatments is given in Fig. 1.

Table 1. Radiotherapy and chemotherapy

(8, 12]    (0, 22]    (24, 31]   (17, 27]   (17, 23]   (24, 30]
(16, 24]   (13, ∞)    (14, 17]   (11, 13]   (16, 20]   (18, 25]
(17, 26]   (32, ∞)    (23, ∞)    (44, 48]   (0, 5]     (5, 8]
(12, 20]   (11, ∞)    (33, 40]   (31, ∞)    (13, 39]   (19, 32]
(34, ∞)    (13, ∞)    (16, 24]   (35, ∞)    (15, 22]   (11, 17]
(22, 32]   (10, 35]   (30, 34]   (13, ∞)    (10, 17]   (8, 21]
(4, 9]     (11, ∞)    (14, 19]   (4, 8]     (34, ∞)    (30, 36]
(18, 24]   (16, 60]   (35, 39]   (21, ∞)    (11, 20]   (48, ∞)

Table 2. Radiotherapy alone

(45, ∞)    (6, 10]    (0, 7]     (46, ∞)    (∞)        (7, 16]
(17, ∞)    (7, 14]    (46, ∞)    (37, 44]   (0, 8]     (4, 11]
(15, ∞)    (11, 15]   (22, ∞)    (46, ∞)    (25, 37]   (46, ∞)
(26, 40]   (46, ∞)    (27, 34]   (36, 44]   (46, ∞)    (36, 48]
(37, ∞)    (40, ∞)    (17, 25]   (46, ∞)    (11, 18]   (38, ∞)
(5, 12]    (37, ∞)    (0, 5]     (18, ∞)    (24, ∞)    (36, ∞)
(5, 11]    (19, 35]   (17, 25]   (24, ∞)    (32, ∞)    (33, ∞)
(19, 26]   (37, ∞)    (34, ∞)    (36, ∞)

Abstract  The  self-consistent  estimator  is  commonly  used  for  estimating  a  survival  function 
with  interval-censored  data.  Recent  studies  on  interval  censoring  have  focused  on  case 
2  interval  censoring,  which  does  not  involve  exact  observations,  and  double  censoring, 
which  involves  only  exact,  right-censored  or  left-censored  observations.  In  this  paper,  we 
consider  an  interval  censoring  scheme  that  involves  exact,  left-censored,  right-censored 
and  strictly  interval-censored  observations.  Under  this  censoring  scheme,  we  prove  that 
the  self-consistent  estimator  is  strongly  consistent  under  certain  regularity  conditions. 

Key  words  and  phrases:  Case  2  interval-censored  data,  exact  observations,  nonpara- 
metric  maximum  likelihood  estimator,  self-consistent  algorithm,  strong  consistency. 

1.  Introduction 

Recent  studies  of  interval  censoring  have  focused  on  case  2  interval-censored  (IC)  data, 
which  involve  a  time-to-event  variable  X  whose  value  is  never  observed  but  is  known  to 
lie  in  the  time  interval  between  two  consecutive  inspection  times  Y  and  Z.  Case  2  interval 
censoring  arises  naturally  in  a  longitudinal  follow-up  study  in  which  the  event  of  interest 
cannot be easily observed (for instance, cancer recurrence, elevation of levels of a biomarker
without any noticeable symptoms).

In this paper, we consider IC data which consist of both case 2 IC data and exact
observations. We call such data mixed IC data. Mixed IC data do arise in clinical follow-up
studies. In a cancer follow-up study in which a tumor marker (for instance, CA 125
in  ovarian  cancer)  is  available,  a  patient  whose  marker  value  is  consistently  on  the  high 
(or  low)  end  of  the  normal  range  in  repeated  testing  is  usually  monitored  very  closely  for 
possible  relapse.  If  such  a  patient  should  relapse,  then  time  to  clinical  relapse  can  often  be 
accurately  determined,  and  an  exact  observation  is  obtained.  However,  if  a  patient  is  not 
under  close  surveillance,  and  would  seek  help  only  after  some  tangible  symptoms  of  the 
disease  have  appeared,  then  time  to  relapse  most  likely  has  to  be  specified  to  be  within 
the  dates  of  two  successive  clinical  visits. 

Another  situation  in  which  such  mixed  IC  data  can  occur  is  in  the  usual  right-censored 
survival  analysis  where  actual  dates  of  events  are  not  recorded,  or  missing,  for  a  subset 
of  the  study  population,  and  can  be  established  only  to  within  specified  intervals.  An 
example  from  the  Framingham  Heart  Study  was  presented  by  Odell  et  al.  (1992).  In  this 
large-scale  longitudinal  heart  disease  study,  time  of  occurrence  of  coronary  heart  disease 
(CHD)  is  recorded  for  almost  every  participant.  However,  time  of  first  occurrence  of  the 
CHD  subcategory  angina  pectoris  may  be  specified  only  as  between  two  clinical  visits, 
several  years  apart,  for  some  of  the  participants  who  suffered  from  angina  pectoris. 

For  case  2  IC  data,  Groeneboom  and  Wellner  (1992)  proposed  the  iterative  convex 
minorant  algorithm  for  obtaining  the  nonparametric  MLE  (NPMLE)  of  the  distribution 
function,  F,  of  X.  The  consistency  of  the  NPMLE  and  the  asymptotic  distribution  of  an 
alternative  estimator  are  obtained  under  the  assumption  that  F  and  the  inspection  time 
distribution  are  both  continuous  and  some  additional  regularity  assumptions.  Under  the 
only  assumption  that  the  random  inspection  times  are  discrete,  Yu  et  al.  (1998)  proved 
the  strong  consistency  of  the  NPMLE.  They  further  established  the  asymptotic  normality 
of the NPMLE by requiring the inspection times to take on only finitely many values.

Another  commonly  discussed  interval  censoring  scheme  is  double  censoring.  Data  are 
said  to  be  subject  to  double  censoring  if  they  are  exact,  left  censored  or  right  censored; 
however,  they  are  not  to  be  strictly  interval  censored.  For  doubly-censored  data,  the 
consistency  and  asymptotic  normality  of  the  self-consistent  estimator  (SCE)  have  been 
established  by  Turnbull  (1974),  Chang  and  Yang  (1987),  and  Gu  and  Zhang  (1993)  under 
different  assumptions. 

For  mixed  IC  data,  Peto  (1973)  obtained  the  NPMLE  of  F  using  a  Newton-Raphson 
type  algorithm.  Turnbull  (1976)  proposed  a  self-consistent  algorithm  for  estimating  F  and 
showed  that  the  associated  SCE  is  also  the  NPMLE.  This  SCE  has  been  widely  employed  in 
medical  applications.  See,  for  example,  Finkelstein  (1986)  and  Becker  and  Melbye  (1991). 
In  this  paper,  we  shall  establish  the  strong  consistency  of  the  SCE  under  the  assumption 
that  F  is  arbitrary  but  the  support  of  the  inspection  times  is  finite.  Although  the  NPMLE 
is consistent with case 2 interval-censored data (Groeneboom and Wellner, 1992), a counterexample
exists showing that the SCE may not be consistent with case 2 interval-censored
data when the inspection times take on only finitely many values (Yu, 1997).
Intuitively, the proof of the consistency of the SCE should be different from that of the
NPMLE. We shall show that this is indeed the case in Sections 3 and 4.

The  organization  of  the  paper  is  as  follows.  Section  2  presents  models  to  describe  the 
mixed  IC  data  and  two  algorithms  for  computing  the  SCE.  The  strong  consistency  of  the 
SCE  is  established  in  Sections  3  and  4.  Some  proofs  are  put  in  the  Appendix. 

2.  Models  For  Mixed  IC  Data 

We  shall  discuss  two  models  for  mixed  IC  data  in  this  section.  The  one  in  Section 
2.2  is  more  general  than  the  one  in  Section  2.1,  but  we  shall  show  that  in  terms  of  the 
properties  of  the  SCE,  it  suffices  to  consider  the  one  in  Section  2.1. 

2.1.  A  Simple  Model  For  Mixed  IC  Data 

Let $(Y, Z)$ denote a pair of extended random censoring times ($\infty$ allowed). Assume
$Y < Z$ with probability one (w.p.1), and $X$ and $(Y, Z)$ are independent. The observable
mixed IC data are equivalent to a random interval

\[ \lfloor L, R \rfloor = \begin{cases}
(Y, \infty) & \text{if } Y < X \text{ and } Z = \infty \text{ (right censored)}, \\
(-\infty, Z] & \text{if } X \le Z \text{ and } Y = -\infty \text{ (left censored)}, \\
(Y, Z] & \text{if } -\infty < Y < X \le Z < \infty \text{ (strictly interval censored)}, \\
[X, X] & \text{if } X \notin (Y, Z] \text{ (exact)}.
\end{cases} \]

Let $\lfloor L_i, R_i \rfloor$, $i = 1, \ldots, n$, be a random sample from $\lfloor L, R \rfloor$ and $\lfloor l_i, r_i \rfloor$ be a realization
of $\lfloor L_i, R_i \rfloor$. Further, let $Q(l, r) = P(L \le l, R \le r)$. Following Peto (1973) and Turnbull
(1976), define sample innermost intervals, denoted by $\lfloor p_j, q_j \rfloor$'s, to be the nonempty intersections
of the intervals $\lfloor l_i, r_i \rfloor$ so that for any pair of intervals $\lfloor p_j, q_j \rfloor$ and $\lfloor l_i, r_i \rfloor$, either
$\lfloor p_j, q_j \rfloor \subset \lfloor l_i, r_i \rfloor$ or $\lfloor p_j, q_j \rfloor \cap \lfloor l_i, r_i \rfloor = \emptyset$. Note that $\lfloor p_j, q_j \rfloor$ denotes a half-open interval
if $p_j < q_j$ and a closed interval if $p_j = q_j$. Moreover, every exact observation constitutes
an innermost interval. We demonstrate the concept of innermost interval by an example.


EXAMPLE. Suppose $n = 5$ and the observed intervals $\lfloor l_i, r_i \rfloor$ are (0, 3], (2, 5], [4, 4], (2, ∞),
and (6, 7]. Then there are three innermost intervals: (2, 3], [4, 4] and (6, 7].
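The innermost intervals of a small data set can be found by sorting the endpoints and looking for a left endpoint that is immediately followed by a right endpoint. The sketch below is an illustration under stated assumptions rather than a procedure taken from the paper: the function names are hypothetical, a pair `(l, r)` with `l < r` stands for the half-open interval (l, r], and `l == r` stands for an exact observation.

```python
import math

def innermost_intervals(obs):
    """Innermost intervals of half-open (l, r] and exact [l, l] observations.

    Returns a sorted list of (p, q); p < q means (p, q], p == q means [p, p].
    """
    events = []
    for l, r in obs:
        # (v, 1) stands for "just after v" (open left end), (v, 0) for the point v
        left_key = (l, 0) if l == r else (l, 1)
        events.append((left_key, 0))      # 0 marks a left endpoint
        events.append(((r, 0), 1))        # 1 marks a right endpoint
    events.sort()                         # on ties, left endpoints sort before right ones
    result = []
    for (key_a, type_a), (key_b, type_b) in zip(events, events[1:]):
        if type_a == 0 and type_b == 1:   # left endpoint directly followed by a right one
            result.append((key_a[0], key_b[0]))
    return result

def delta_matrix(obs, inner):
    """delta[i][j] = 1 if innermost interval j lies inside observation i."""
    def contains(o, c):
        (l, r), (p, q) = o, c
        if p == q:                        # point [p, p]
            return (l == r == p) or (l < p <= r)
        return l <= p and q <= r          # (p, q] inside (l, r]
    return [[int(contains(o, c)) for c in inner] for o in obs]
```

Applied to the five intervals of the example, with `math.inf` for the right-censored one, `innermost_intervals` returns the pairs (2, 3), (4, 4) and (6, 7), matching the three innermost intervals listed above.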

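Suppose there are $m$ ($\le n$) such distinct intervals: $\lfloor p_1, q_1 \rfloor, \lfloor p_2, q_2 \rfloor, \ldots, \lfloor p_m, q_m \rfloor$,
where $p_1 \le q_1 \le p_2 \le q_2 \le \cdots \le q_m$. Define $\delta_{ij} = I(\lfloor p_j, q_j \rfloor \subset \lfloor l_i, r_i \rfloor)$, where $I(A)$
denotes the indicator function of the set $A$.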

The self-consistent algorithm (Turnbull, 1976) for obtaining the SCE $F_n$ (which assigns
weight to innermost intervals only) of $F$ is given by

$$
F_n(x) = \sum_{j:\, q_j \le x} s_{nj}, \quad x \ge 0,
$$

where $\{s_{n1}, \ldots, s_{nm}\}$ are the probability masses assigned to the corresponding innermost
intervals, and satisfy the self-consistent equations

$$
s_{nj} = \sum_{i=1}^{n} \frac{\delta_{ij}\, s_{nj}}{n \sum_{k=1}^{m} \delta_{ik}\, s_{nk}}, \quad j = 1, \ldots, m.
\tag{2.2}
$$
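
A minimal sketch of how the self-consistent equations (2.2) can be solved by fixed-point iteration is given below. The function name `self_consistent_masses`, the uniform starting vector, and the stopping rule are assumptions chosen for illustration; the matrix `delta` is the indicator matrix $\delta_{ij}$ for the five observed intervals and three innermost intervals of the example above.

```python
def self_consistent_masses(delta, tol=1e-12, max_iter=10_000):
    """Iterate the self-consistent equations (2.2) from a uniform start.

    delta[i][j] is 1 if innermost interval j lies inside observed interval i.
    """
    n, m = len(delta), len(delta[0])
    s = [1.0 / m] * m  # assumed initial value: equal mass on every interval
    for _ in range(max_iter):
        updated = [0.0] * m
        for row in delta:
            # Total current mass inside observation i (denominator of (2.2)).
            denom = sum(d * sj for d, sj in zip(row, s))
            for j in range(m):
                updated[j] += row[j] * s[j] / (n * denom)
        if max(abs(u - v) for u, v in zip(updated, s)) < tol:
            return updated
        s = updated
    return s

# Rows: (0,3], (2,5], [4,4], (2,inf), (6,7]; columns: (2,3], [4,4], (6,7].
delta = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 1, 1], [0, 0, 1]]
masses = self_consistent_masses(delta)
print(masses)
```

For this small data set the fixed point can be checked by hand: substituting $(3/8, 3/8, 1/4)$ into (2.2) reproduces the same vector, and the iteration converges to it from the uniform start.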


Li, Watkins and Yu (1997) proposed an alternative approach based on the EM algorithm
for obtaining the SCE $F_n$, expressing $H_n = F_n$ as a solution of the integral equation

$$
H_n(x) = \int \frac{H_n(x) - H_n(l)}{H_n(r) - H_n(l)}\, I(l < x < r)\, dQ_n(l, r) + \frac{1}{n} \sum_{i=1}^{n} I(R_i \le x), \quad H_n \in \Theta,
\tag{2.3}
$$

where $\Theta = \{h : h$ is a nondecreasing function from $[-\infty, \infty]$ to $[0, 1]$ such that $h(-\infty) = 0$
and $h(\infty) = 1\}$ and $Q_n(l, r)$ is the empirical version of $Q(l, r)$. They showed that with
proper initial values, algorithms (2.2) and (2.3) give the same weight $s_{nj} = H_n(q_j) - H_n(p_j)$
to the innermost interval $\lfloor p_j, q_j]$ when it is not closed, or the same weight $H_n(q_j) - H_n(p_j-)$
to $\lfloor p_j, q_j]$ when it is closed. That is, $H_n$ and $F_n$ are equivalent. We shall make
use of expression (2.3) to establish the strong consistency of $F_n$ in Sections 3 and 4.

Following the identifiability assumption given in Chang and Yang (1987), we define
$K(x) = P\{X \text{ is not censored} \mid X = x\}$ for each $x$. Let $\tau_l = \inf\{x : K(x) = 0\}$ and
$\tau_r = \sup\{x : K(x) = 0\}$ if $\{x : K(x) = 0\} \ne \emptyset$. Otherwise, define $\tau_l = \tau_r = \infty$. For each
$x \in (\tau_l, \tau_r)$, either $K(x) = 0$ or $K(x)$ is not defined. To see this, it suffices to show that
for any two points $a < b$ satisfying $K(a) = K(b) = 0$, there does not exist $x \in (a, b)$ such
that $K(x) > 0$. In fact, $K(a) = 0$ implies that $P\{Y < a\} \ge P\{Y < a \le Z\} = 1$. Also,
$K(b) = 0$ implies that $P\{Z \ge b\} \ge P\{Y < b \le Z\} = 1$. Thus

$$
P\{Y < \tau_l \le \tau_r \le Z\} = 1 \quad \text{and} \quad K(x) = 0 \ \ \forall x \in (\tau_l, \tau_r).
\tag{2.4}
$$
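
As an illustration of the function $K$, the sketch below evaluates it for a discrete censoring distribution. Under model (2.1) an observation is exact precisely when $X \notin (Y, Z]$, and $X$ is independent of $(Y, Z)$, so $K(x) = 1 - P\{Y < x \le Z\}$. The function name `uncensored_prob` and the toy distribution are hypothetical.

```python
def uncensored_prob(x, censoring_dist):
    """K(x) = P{x is not in (Y, Z]} for a discrete law of (Y, Z).

    censoring_dist is a list of ((y, z), probability) pairs.
    """
    covered = sum(p for (y, z), p in censoring_dist if y < x <= z)
    return 1.0 - covered

# Hypothetical censoring law: (Y, Z) = (1, 3) or (2, 5), each with probability 1/2.
censoring_dist = [((1, 3), 0.5), ((2, 5), 0.5)]
for x in (1.5, 2.5, 3, 6):
    print(x, uncensored_prob(x, censoring_dist))
```

In this toy case every censoring interval covers the points of $(2, 3]$, so $K$ vanishes exactly there, giving $\tau_l = 2$ and $\tau_r = 3$.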

There are only four possible cases that model (2.1) implies: (1) $\tau_l > -\infty$ and $\tau_r = \infty$,
(2) $\tau_l = -\infty$ and $\tau_r < \infty$, (3) $-\infty < \tau_l \le \tau_r < \infty$, and (4) $\{x : K(x) = 0\} = \emptyset$. Case
(1) is a right censorship model, as $P(Z = \infty) = 1$ by (2.4). Moreover, case (2) is a left
censorship model. Neither of them allows strictly interval-censored observations. If
case (3) is true, then $Y < \tau_l$ and $Z \ge \tau_r$ w.p.1, which is not practically realistic. Thus case
(4) is the only practical case in model (2.1) that includes both strictly interval-censored
observations and exact observations. In the next subsection we see how to extend model (2.1)
to cover more general situations.

2.2.  A  Model  for  More  General  Mixed  IC  Data. 

Even though case (4) in Section 2.1 does not have the drawbacks of the first three
cases, it implies that $P\{K(X) > 0\} = 1$. It is often the case that a study can only last for
a certain period of time, say, a time interval $[a, b]$, where $0 < F(a) < F(b) < 1$. In such a
case, the mixed interval-censored observation $\lfloor L, R]$ satisfies

$$
P\{L \text{ or } R \in (-\infty, a) \cup (b, +\infty)\} = 0.
\tag{2.5}
$$

Consequently, $P\{K(X) > 0\} \le P\{a \le X \le b\} < 1$. Thus, model (2.1) cannot specify such
mixed IC data. Note that (2.1) is equivalent to

$$
\lfloor L, R] =
\begin{cases}
(Y, Z] & \text{if } X \in (Y, Z],\\
[X, X] & \text{if } X \notin (Y, Z].
\end{cases}
\tag{2.6}
$$


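We now formulate a model for mixed IC data satisfying (2.5). Assume $Y < Z$ w.p.1,
and $P\{Y \text{ or } Z \in (-\infty, a) \cup (b, \infty)\} = 0$. Suppose that $X$ and $(Y, Z)$ are independent and
the observable random vector is

$$
\lfloor L, R] =
\begin{cases}
(Y, Z] & \text{if } X \in (Y, Z],\\
[X, X] & \text{if } X \notin (Y, Z] \text{ and } a \le X \le b,\\
(-\infty, a] & \text{if } X \notin (Y, Z] \text{ and } X < a,\\
(b, \infty) & \text{if } X \notin (Y, Z] \text{ and } X > b.
\end{cases}
\tag{2.7}
$$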


In the case of (2.5) or (2.7), we can only estimate $F(x)$ for $x$ in $[a, b]$, or equivalently, the
cdf $F^*$ of $X^*$, where $X^* = a I(X < a) + X I(a \le X \le b) + 2b I(X > b)$. Note that $X^*$
and $(Y, Z)$ are independent. Due to (2.5) or (2.7), the right-censored observation $(b, \infty)$
will always be an innermost interval. The NPMLE (or an SCE) of $F(x)$ is not uniquely
determined for $x \in (b, \infty)$ (see, e.g., Peto, 1973), though the total mass assigned by the
NPMLE (or the SCE) to the interval $(b, \infty)$ is uniquely determined. Thus we can, without
loss of generality (WLOG), assume that the mass is put at the point $2b$ ($\in (b, \infty)$). In
other words, $(b, \infty)$ can be treated as an exact observation $[2b, 2b]$. For a similar reason,
the left-censored observation $(-\infty, a]$ can be treated as an exact observation $[a, a]$. Thus
model (2.7) is equivalent to

$$
\lfloor L, R] =
\begin{cases}
(Y, Z] & \text{if } X^* \in (Y, Z],\\
[X^*, X^*] & \text{if } X^* \notin (Y, Z].
\end{cases}
\tag{2.8}
$$


If  F(a)  =  0  and  F(b)  =  1,  then  models  (2.7)  and  (2.8)  are  the  same  as  (2.1)  (or  (2.6)). 

In view of (2.6) and (2.8), it is easy to see that in the case of (2.5) or (2.7), in order to
estimate $F$, it suffices to estimate $F^*$, which reduces model (2.7) to model (2.1). A similar
modification can be made to handle the situation in which there are no observations $L$ or $R$
in a union of arbitrary intervals. In view of the above discussion, we shall focus on model
(2.1) for the rest of the paper.
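
The reduction from model (2.7) to model (2.1) amounts to replacing $X$ by $X^* = a I(X < a) + X I(a \le X \le b) + 2b I(X > b)$. A small sketch of that map follows; the function name `to_x_star` is hypothetical, and $b > 0$ is assumed so that $2b$ lies in $(b, \infty)$.

```python
def to_x_star(x, a, b):
    """Map X to X* = a*I(X < a) + X*I(a <= X <= b) + 2b*I(X > b).

    Assumes 0 < b so that the point 2b lies to the right of the window [a, b].
    """
    if x < a:
        return a          # everything before the study window collapses to a
    if x > b:
        return 2 * b      # everything after the window collapses to 2b
    return x              # values inside [a, b] are left unchanged

# Hypothetical study window [a, b] = [1, 10].
print([to_x_star(x, 1, 10) for x in (0.2, 1, 4.5, 10, 37)])  # [1, 1, 4.5, 10, 20]
```

After this transformation the data follow model (2.8), which has the same form as (2.6) with $X^*$ in place of $X$.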


3.  Consistency  In  Case  Of  Finite  Support  For  F 

In this section, we assume that both the support of $X$, say $S_F$, and the support of
$Y$ and $Z$, say $S_G$, contain finitely many points. The generalization of $F$ to an arbitrary
distribution function is given in Section 4. The assumption concerning $S_G$ is a reasonable
one. In practice, inspections in most follow-up studies are recorded on a discrete time
scale (daily, weekly, monthly, etc.), and the total study period is finite, so the number
of censoring points, i.e. the support of $Y$ and $Z$, is also finite. Such an assumption was
adopted by Finkelstein (1986) and Becker and Melbye (1991), among others.

Suppose that $X$ takes on values $x_1, x_2, \ldots, x_u$, and $\lfloor L, R]$ takes on values $I_1 = \lfloor l_1^o, r_1^o]$,
$I_2 = \lfloor l_2^o, r_2^o], \ldots, I_v = \lfloor l_v^o, r_v^o]$ with probability $e_i = P\{L = l_i^o, R = r_i^o\} > 0$. Based
on the assumption that $K(x) > 0$ for all $x > 0$, Chang and Yang (1987) and Gu and
Zhang (1993) proved the consistency of the SCE for doubly-censored observations. In
this paper, we weaken this assumption and prove the consistency of the SCE on the set
$\mathcal{O} = \{x \mid K(x) > 0\}$ with mixed IC data.

For a point $x$ satisfying $K(x) = 0$ and $P\{X = x\} > 0$, since there are no exact
observations available at this point, the distribution function $F$ is not estimable, and hence
consistency cannot be assessed. Let us consider the structure of the innermost intervals
as the sample size $n \to \infty$. For $x_i$, if $K(x_i) > 0$, then it follows from the strong law of large
numbers that $P\{X_k \ne x_i \text{ for all } k = 1, 2, \ldots, n\} \to 0$ as $n \to \infty$. In other words, $K(x_i) > 0$
implies that $[x_i, x_i]$ is an innermost interval w.p.1, which further implies that the
union of all closed innermost intervals coincides with the set $\mathcal{O}$. Let $A_1, A_2, \ldots, A_{m_1}$ be
the innermost intervals induced by the intervals $I_i$, $i = 1, 2, \ldots, v$, and call them population
innermost intervals. It is seen that as $n \to \infty$ the set of sample innermost intervals induced
by $\lfloor L_i, R_i]$, $i = 1, 2, \ldots, n$, converges almost surely to the set of population innermost
intervals. Since we are only concerned with large sample properties, we can, WLOG,
assume that the sample size is large enough so that $m = m_1$. For the rest of the paper, $m$
will be used to denote both the number of population innermost intervals and the number
of sample innermost intervals. Also we shall suppress the qualifier w.p.1 throughout the
rest of the paper to avoid repetition.

Let $s_n = (s_{n1}, s_{n2}, \ldots, s_{nm})$ be a solution of (2.2). For sufficiently large $n$, the $s_{ni}$'s are
the masses assigned to the innermost intervals by the SCE. Since $\{H_n, n \ge 1\}$ is a uniformly
bounded sequence of monotone functions, it follows from the Helly-Bray selection theorem that
there exist a subsequence, say $\{n_k\}$, of integers, a function $H$ and a vector $s$, such that
$\lim_{n_k \to \infty} H_{n_k}(x) = H(x)$ and $\lim_{n_k \to \infty} s_{n_k} = s = (s_1, s_2, \ldots, s_m)$. Taking the limit in (2.2)
and (2.3) with respect to $n_k$, we obtain

$$
s_j = \sum_{i=1}^{v} e_i \frac{\delta_{ij}\, s_j}{\sum_{k=1}^{m} \delta_{ik}\, s_k}, \quad j = 1, \ldots, m, \qquad \sum_{j=1}^{m} s_j = 1,
\tag{3.1}
$$

where $\delta_{ij} = I(A_j \subset \lfloor l_i^o, r_i^o])$, and

$$
H(x) = \int_{l < x < r} \frac{H(x) - H(l)}{H(r) - H(l)}\, dQ(l, r) + P\{R \le x\}, \quad H \in \Theta,
\tag{3.2}
$$

since $Q_{n_k}$ converges to $Q$ almost surely as $n_k \to \infty$.

We state two lemmas. The proof of Lemma 1 is relegated to the appendix. Hereafter,
the discussion regarding uniqueness of the solution $H(x)$ of Eq. (3.2) will be restricted to
the set $\mathcal{O}$.

Lemma 1 Let $s^o = (s_1^o, \ldots, s_m^o)$, where $s_j^o = P\{X \in A_j\}$, $j = 1, 2, \ldots, m$. Then $s = s^o$
is the unique solution of Eq. (3.1).

Lemma 2 (Li, Watkins and Yu, 1997) Let $dH$ be the measure induced by a c.d.f. $H$ and
$s_j = dH(A_j)$ for all $j$. Then $s = (s_1, \ldots, s_m)$ is a solution to Eq. (3.1) if and only if $H$ is
a solution to Eq. (3.2).

Theorem 1 Suppose $S_F$ and $S_G$ contain finitely many points. Then (1) $F$ is the unique
solution of Eq. (3.2) for $x \in \mathcal{O}$, and (2) the SCE $H_n(x)$ of $F(x)$ satisfies
$\sup_{x \in \mathcal{O}} |H_n(x) - F(x)| \to 0$ as $n \to \infty$.

Proof. Since $s_j^o = dF(A_j)$, $F$ is a solution of Eq. (3.2) by Lemmas 1 and 2. Meanwhile,
for each solution $H$ of (3.2), $dH$ is uniquely determined by Lemmas 1 and 2 again.
Consequently, statement (1) follows.

It follows from the convergence of $H_{n_k}$ and (1) above that $H_{n_k}(x) \to F(x)$ as $n_k \to \infty$
for $x \in \mathcal{O}$. The convergence of $H_n(x)$ to $F(x)$ for $x \in \mathcal{O}$ follows from the Helly-Bray selection
theorem. The uniform convergence of $H_n$ is immediate by the assumption that $S_F$ and
$S_G$ contain only finitely many points. □

REMARK. Even though $F(x)$ is not estimable for $x \in (\tau_l, \tau_r)$, it is easy to see that
$H(\tau_r) - H(\tau_l) = F(\tau_r) - F(\tau_l)$. This remark is also valid for model (2.7). Moreover, under
model (2.7) $F(x)$ is not estimable for $x < a$ or $x > b$.

4.  Consistency  Of  Hn  For  Arbitrary  F 

In this section, we extend the result of the previous section to the case where $G$ is the
same as previously defined but $F$ is arbitrary.

Theorem 2 Suppose that $(Y, Z)$ takes on finitely many values and $F$ is arbitrary. Then
the solution to Eq. (3.2) is unique. Furthermore, $H_n(x)$ converges to $F(x)$ uniformly for
all $x \in \mathcal{O}$.

Proof. The main idea of the proof is to partition the interval $[0, \infty)$ into finitely many
subintervals, and then to prove the consistency of the SCE for every such subinterval.

WLOG, assume that the values that $(Y, Z)$ can take are $(a_i, b_i)$, $i = 1, \ldots, N_0$, for
some integer $N_0$. Rank the $2N_0$ values $\{a_i, b_i\}$ in increasing order to obtain a sequence
(ties and infinity are allowed). Let $d_1 < d_2 < \cdots < d_N$ ($N \le 2N_0$) be the distinct finite
values of the sequence. We first partition $[0, \infty)$ into

$$
[0, 0],\ (0, d_1),\ [d_1, d_1],\ (d_1, d_2),\ \ldots,\ [d_N, d_N],\ (d_N, \infty).
\tag{4.1}
$$

Note that in this partition, all exact observations in the same interval $(d_j, d_{j+1})$ (or $[d_j, d_j]$)
carry the same weight. This is because for any observed interval, if $(d_j, d_{j+1})$ (or $[d_j, d_j]$)
is not a subset of the interval, then it is disjoint from the observed interval, and because
the weight received by an innermost interval is determined by all the observed intervals
that cover the innermost interval (see (2.2)).
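
The first partition (4.1) is easy to generate mechanically. The sketch below is an illustration only: the function name `first_partition` and the tuple encoding of the pieces are hypothetical, and it assumes the finite censoring values of interest are strictly positive, in line with the partition of $[0, \infty)$ used in the proof.

```python
import math

def first_partition(censoring_pairs):
    """Build partition (4.1) of [0, inf) from the values (a_i, b_i) of (Y, Z).

    Pieces are encoded as ("point", d) for [d, d] and ("open", lo, hi) for (lo, hi).
    """
    # Distinct finite cut points d_1 < ... < d_N (assumed positive here).
    cuts = sorted({v for pair in censoring_pairs for v in pair
                   if math.isfinite(v) and v > 0})
    pieces = [("point", 0)]
    lower = 0
    for d in cuts:
        pieces.append(("open", lower, d))
        pieces.append(("point", d))
        lower = d
    pieces.append(("open", lower, math.inf))
    return pieces

# Hypothetical censoring values: (1, 3), (2, inf) and (-inf, 3).
for piece in first_partition([(1, 3), (2, math.inf), (-math.inf, 3)]):
    print(piece)
```

With $N$ distinct finite cut points the partition has $2N + 2$ pieces, alternating single points and open intervals.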

For a fixed $\varepsilon > 0$, if there is a value $d$ in an open interval $(d_j, d_{j+1})$ such that
$P\{X = d\} > \varepsilon$, divide the interval into $(d_j, d)$, $[d, d]$, $(d, d_{j+1})$. Perform the partitioning for every
such $d$. Since the set of such $d$ values is finite, the total number of intervals partitioning
$[0, \infty)$ must also be finite. WLOG, assume (4.1) is the final partition at this stage.

Consider an interval, say, $(d_1, d_2)$, such that $F(d_2-) - F(d_1) > 0$. For this fixed $\varepsilon$,
partition $(d_1, d_2)$ into subintervals, say $(c_1, c_2)$, $[c_2, c_2]$, $(c_2, c_3), \ldots, (c_k, c_{k+1})$ ($c_1 = d_1$ and
$c_{k+1} = d_2$) such that $F(c_{i+1}-) - F(c_i) < \varepsilon$ for $i = 1, 2, \ldots, k$. Perform this second partition
for every interval $(d_i, d_{i+1})$ and $[d_i, d_i]$ for all $i = 0, 1, \ldots, N$, where $d_0 = 0$.

From now on, we focus our discussion on $(d_1, d_2)$. The argument for other intervals
is similar. Let $c_i'$ be the midpoint of the interval $(c_i, c_{i+1})$, $i = 1, \ldots, k$, and construct a
new (pseudo) distribution function $F'$ with finite support, $F'(d_i) = F(d_i)$ and
$F'(c_i') - F'(c_i'-) = F(c_{i+1}-) - F(c_i)$, for all $i$. It is readily seen that

$$
\sup_x |F(x) - F'(x)| < \varepsilon.
\tag{4.2}
$$

It can be verified that if $(\tau_l, \tau_r)$ is not an empty set, then one of $(d_i, d_{i+1})$ must be $(\tau_l, \tau_r)$,
due to the special structure of partition (4.1). In addition, since consistency is restricted
to $\mathcal{O}$, it is natural to assume that $S_F \cap (d_1, d_2) \subset \mathcal{O}$. Moreover, since $F(d_2-) - F(d_1) > 0$
and $S_F \cap (d_1, d_2) \subset \mathcal{O}$, the probability of having exact observations in $(d_1, d_2)$ converges
to one as $n \to \infty$. Thus, for $n$ large enough, we can eventually observe exact observations
in the interval and hence $(d_1, d_2]$ cannot be a half-open innermost interval.

Consider a pseudo random variable $X' = \sum_i [c_i' I(X \in (c_i, c_{i+1})) + c_i I(X = c_i)]$.

Note that $X'$ has the distribution function $F'$. Suppose there are $h$ exact observations
in $(c_i, c_{i+1})$; then the pseudo random variable $X'$ will assume the value $c_i'$ as an exact
observation a total of $h$ times. For sample size $n$, let $w_{ni}$ denote the weight received by
all exact observations in $(c_i, c_{i+1})$, and let $w'_{ni} = w_{ni}$ be the weight received by $c_i'$ from
the pseudo observations generated by $X'$. Let $H'_n$ denote the corresponding SCE of $F'$
associated with $X'$ and $H_n$ the SCE of $F$ associated with $X$. It is easy to see that for each
$i$ and $D_i = (d_i, d_{i+1})$ or $[d_i, d_i]$,

$$
dH_n(D_i) = dH'_n(D_i).
\tag{4.3}
$$

By the results of the previous section and the finiteness of the support of $F'$, $H'_n(d_i) \to F'(d_i)$
for each $i$ as $n \to \infty$. Thus, it follows from (4.2) and (4.3) that when $n$ is large enough,

$$
\sup_{x \in \mathcal{O}} |F(x) - H_n(x)| < \varepsilon,
\tag{4.4}
$$

which proves the consistency of the SCE $H_n$. By the Helly-Bray selection theorem and the
fact that $F$ is a solution of (3.2), (4.4) also shows that $F$ is the unique solution of (3.2). □

Appendix 

In this appendix, we give a proof of Lemma 1. To show the uniqueness of the solution
of (3.1), consider the generalized log-likelihood function

$$
\Lambda = \Lambda(s) = \sum_{i=1}^{v} e_i \ln\Bigl(\sum_{j=1}^{m} \delta_{ij} s_j\Bigr).
$$

It follows from (2.2) that $\sum_{j=1}^{m} \delta_{ij} s_j > 0$ for every $i$, $1 \le i \le v$, and hence the function
$\Lambda$ is well defined. Let $s^e = (s_1^e, s_2^e, \ldots, s_m^e)$ denote the solution of Eq. (3.1) and
$s^o = (s_1^o, \ldots, s_m^o)$ the probability vector with $s_j^o = P\{X \in A_j\}$.

To facilitate the proof of Lemma 1, we first establish three lemmas. Following the
notation of Section 3, let $I_i = \lfloor l_i^o, r_i^o]$ and $e_i = P\{\lfloor L, R] = I_i\} = \alpha_i^o g_i$, where

$$
\alpha_i^o = P\{X \in I_i\} = \sum_{j=1}^{m} \delta_{ij} s_j^o, \quad i = 1, \ldots, v,
\tag{A.1}
$$

and $g_i = P\{\lfloor Y, Z] = I_i\}$ if $l_i^o < r_i^o$ and $P\{l_i^o \notin (Y, Z]\}$ if $l_i^o = r_i^o$. Thus $\sum_i \alpha_i^o g_i = \sum_i e_i = 1$.
It is seen that $s^o$ uniquely determines $\alpha^o = (\alpha_1^o, \alpha_2^o, \ldots, \alpha_v^o)$. Let $\alpha = (\alpha_1, \ldots, \alpha_v)$, with
$\alpha_i = \sum_{j=1}^{m} \delta_{ij} s_j$. Then $\Lambda$ can be rewritten as

$$
\Lambda(s) = \sum_{i=1}^{v} e_i \ln \alpha_i = \Lambda(\alpha).
$$

Thus maximizing $\Lambda(s)$ is equivalent to maximizing $\Lambda(\alpha)$. Note that $e_i$ and $g_i$ are fixed,
but $\alpha_i$ are not.

Let $X^s$ be a random variable such that $P\{X^s \in A_j\} = s_j$, $j \ge 1$. Define a new random
interval $\lfloor L^s, R^s]$ to be the counterpart of $\lfloor L, R]$ in (2.1) with $X$ replaced by $X^s$. Thus $\alpha_i$
satisfies $\alpha_i g_i = P\{\lfloor L^s, R^s] = I_i\}$. It follows that

$$
\sum_i \alpha_i g_i = 1.
\tag{A.2}
$$

Lemma A1 Suppose $\alpha$ satisfies (A.2). Then $\Lambda(\alpha)$ is uniquely maximized by $\alpha^o$, where
$\alpha_i^o = e_i / g_i$.

Proof. Let $t_i = \alpha_i g_i$, $i = 1, \ldots, v$. Then $t_i \in [0, 1]$ and $\sum_i t_i = 1$. Let $h(t) = \sum_{i=1}^{v} e_i \ln t_i$.
Note that

$$
h(t) = \sum_i e_i \ln \alpha_i + \sum_i e_i \ln g_i = \Lambda(\alpha) + \sum_i e_i \ln g_i
$$

and $\sum_i e_i \ln g_i$ is fixed under the given assumption. Hence, maximizing $\Lambda(\alpha)$ is equivalent
to maximizing $h(t)$. It can be shown that $h(t)$ is uniquely maximized by $t_i = e_i$, $i \ge 1$.
Therefore, the unique maximizer for $\Lambda(\alpha)$ is $\alpha_i^o = e_i / g_i$. □
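
The claim in the proof of Lemma A1 that $h(t)$ is uniquely maximized at $t_i = e_i$ can be verified with the elementary inequality $\ln x \ge 1 - 1/x$ (equality only at $x = 1$). The following is a sketch of that standard argument, assuming $e_i > 0$ for all $i$ as in Section 3 and $\sum_i e_i = \sum_i t_i = 1$; it is not taken from the original proof.

```latex
h(e) - h(t)
  = \sum_{i=1}^{v} e_i \ln\frac{e_i}{t_i}
  \;\ge\; \sum_{i=1}^{v} e_i \Bigl(1 - \frac{t_i}{e_i}\Bigr)
  = \sum_{i=1}^{v} e_i - \sum_{i=1}^{v} t_i
  = 0,
\qquad \text{with equality if and only if } t_i = e_i \text{ for all } i.
```

This is the familiar nonnegativity of the Kullback-Leibler divergence between the probability vectors $(e_i)$ and $(t_i)$.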

Lemma A2 $s^o$ is the unique maximum point of $\Lambda(s)$.

Proof. Following the notation in Lemma A1, we have $\Lambda(s) = \Lambda(\alpha(s))$. By Lemma A1 and
the equality $\alpha(s^o) = \alpha^o$ (due to (A.1)), $s^o$ is a maximum point of $\Lambda(s)$. By the finiteness
assumption on $S_F$ and $S_G$, each population innermost interval is a realization of $\lfloor L, R]$,
say, $A_j = I_{i_j}$, except perhaps $(\tau_l, \tau_r)$ if $(\tau_l, \tau_r)$ is not an empty set. Thus (A.1) implies
that there are at least $m - 1$ (out of $v$) $i_j$'s such that $\alpha_{i_j}^o = s_j^o$. Since $\Lambda(\alpha(s))$ is uniquely
maximized by $\alpha^o$, $s^o$ is the unique maximum point of $\Lambda(s)$. □

Lemma A3 $s_k^o > 0$ implies that $s_k^e > 0$.

Proof. Note that for each $k$, if $P\{X \in A_k\} > 0$, then there is an integer $h$ such that
$I_h = A_k$ and thus $\delta_{hj} = 0$ if $j \ne k$ and 1 otherwise. For $j = k$, Eq. (3.1) yields

$$
s_k^e \ge e_h \frac{s_k^e}{s_k^e} = e_h > 0 \quad (\text{since } I_h = A_k \text{ and } e_h > 0).
$$

This completes the proof of the lemma. □


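We are now ready to prove Lemma 1. By Lemma A2, $s^o$ is the unique maximizer
of $\Lambda$. Thus, to prove the lemma, it suffices to show that $s^o = s^e$. Consider the effect of
increasing a particular component, $s_k$, by a small amount $u$ and then dividing all the $s_j$,
including $s_k + u$, by $1 + u$ in order to ensure that the components of $s$ sum to 1. Let

$$
d_k(s) = \frac{\partial}{\partial u} \Lambda\Bigl(\frac{s_1}{1+u}, \ldots, \frac{s_k + u}{1+u}, \ldots, \frac{s_m}{1+u}\Bigr)\Big|_{u=0}.
$$

Then, writing $c_j = s_j/(1+u)$ if $j \ne k$ and $c_j = (s_k + u)/(1+u)$ if $j = k$,

$$
\begin{aligned}
d_k(s) &= \sum_{i=1}^{v} e_i \frac{\partial}{\partial u} \ln \sum_{j=1}^{m} \delta_{ij} c_j \Big|_{u=0}
        = \sum_{i=1}^{v} e_i \frac{\sum_{j=1}^{m} \delta_{ij}\, \frac{\partial}{\partial u} c_j}{\sum_{j=1}^{m} \delta_{ij} c_j} \Big|_{u=0} \\
       &= \sum_{i=1}^{v} e_i \Bigl(\frac{\delta_{ik}}{\sum_{j=1}^{m} \delta_{ij} s_j} - 1\Bigr), \quad k = 1, \ldots, m.
\end{aligned}
\tag{A.3}
$$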


Consider two separate situations regarding the values of $s_k^e$.

CASE 1. $s_k^e > 0$ for all $k$.

Since $s^e$ is a solution to (3.1) and $s_k^e > 0$ for all $k$, dividing the $k$th equation of (3.1)
by $s_k^e$ gives

$$
0 = \sum_{i=1}^{v} e_i \Bigl(1 - \frac{\delta_{ik}}{\sum_{j=1}^{m} \delta_{ij} s_j^e}\Bigr), \quad k = 1, \ldots, m,
\tag{A.4}
$$

since $\sum_{i=1}^{v} e_i = 1$. In view of (A.3) and (A.4), $d_k(s^e) = 0$ and $s_k^e > 0$ for each $k$. Therefore,
$s^e$ is the maximum point of $\Lambda$. By Lemma A2, $s^e = s^o$.

CASE 2. $s_k^e = 0$ for some $k$.

WLOG, assume that $s_m^e = 0$ and $s_k^e > 0$ for $k = 1, 2, \ldots, m - 1$. We shall show
that $s_m^o > 0$ leads to a contradiction. If $s_m^o > 0$, then $s_i^e > 0$ for all $i$ by Lemma A3.
Consequently, (A.4) holds. By virtue of (A.3) and (A.4), $s^e$ is a maximum point of $\Lambda$, and
it follows that $s^e = s^o$. However, this contradicts the hypothesis that $s_m^e = 0$. This
completes the proof for Case 2 and thus the proof of the lemma. □

ABSTRACT

We consider the case 1 interval censorship model in which the survival time has an arbitrary
distribution function $F_0$ and the inspection time has a discrete distribution function $G$. In such
a model one is only able to observe the inspection time and whether the value of the survival
time lies before or after the inspection time. We prove the strong consistency of the generalized
maximum-likelihood estimate (GMLE) of the distribution function $F_0$ at the support points of
$G$ and its asymptotic normality and efficiency at what we call regular points. We also present a
consistent estimate of the asymptotic variance at these points. The first result implies uniform strong
consistency on $[0, \infty)$ if $F_0$ is continuous and the support of $G$ is dense in $[0, \infty)$. For arbitrary
$F_0$ and $G$, Peto (1973) and Turnbull (1976) conjectured that the convergence for the GMLE is
at the usual parametric rate $n^{1/2}$. Our asymptotic normality result supports their conjecture under
our assumptions. But their conjecture was disproved by Groeneboom and Wellner (1992), who
obtained the nonparametric rate $n^{1/3}$ under smoothness assumptions on $F_0$ and $G$.

1.  INTRODUCTION 

In survival analysis, one frequently is unable to precisely observe the survival time $X$
of interest, but can only assess that it belongs to some random interval. The simplest such
model is the so-called case 1 interval-censorship model. In this model one is only able
to observe a random time $Y$ and whether $X$ lies in the random interval $[0, Y]$ or $(Y, \infty)$.
More formally, one observes $(Y, \Delta)$, where $\Delta = I[X \le Y]$. Here and below $I[A]$ denotes
the indicator function of the event $A$. The random time $Y$ is called the inspection time.

*This work was partially supported by NSF Grant DMS-9402561 and DAMD17-94-J-4332 (Q.Y.), LEQSF
Grant 357-70-4107 (L.L.), and DAMD17-94-J-4332 (G.Y.C.W.).

Such data arise in industrial life testing and medical research. Consider for example an
animal sacrifice study in which a laboratory animal has to be dissected to check whether
a tumour has developed. In this case, $X$ is the onset of tumour and $Y$ is the time of the
dissection, and we can only infer at the time of dissection whether the tumour is present
or has not yet developed. Other examples are mentioned in Ayer et al. (1955), Keiding
(1991) and Wang and Gardiner (1996).

We shall assume throughout that the lifetime $X$ and the inspection time $Y$ are independent
and denote their distribution functions by $F_0$ and $G$, respectively. Our data consist
of $n$ independent copies $(Y_i, \Delta_i) = (Y_i, I[X_i \le Y_i])$, $i = 1, \ldots, n$, of $(Y, \Delta)$. We consider
estimating (characteristics of) the distribution function $F_0$ based on these data.

Ayer et al. (1955) derived the explicit expression of the generalized maximum-likelihood
estimator (GMLE) of the distribution function $F_0$. Moreover, they established
the weak consistency of the GMLE at continuity points $x$ of $F_0$ under additional assumptions
on $G$. They also mentioned the strong consistency of the GMLE at each support
point of a discrete $Y$ with finitely many values. Using an inequality of theirs, we shall
generalize this result to arbitrary discrete $Y$ in our Theorem 2.1. From this result we
shall derive the uniform strong consistency on the entire line if $F_0$ is continuous and the
support of $Y$ is dense in the positive half line. Moreover, using Theorem 2.1 of Ayer et
al. (1955), we shall derive another explicit representation of the GMLE at what we call
regular points and conclude with its aid the asymptotic normality and efficiency of the
GMLE at such points.

Peto (1973) considered the problem of obtaining the GMLE based on interval-censored
data using a Newton-Raphson algorithm. Turnbull (1976) proposed a self-consistent
algorithm and showed that it converges to the GMLE. Both conjectured that for
arbitrary $F_0$ and $G$, the GMLE is asymptotically normal at the usual $n^{1/2}$ rate. Thus
our results provide a partial justification of their claim for discrete $Y$. It was, however,
shown by Groeneboom and Wellner (1992) that this conjecture is false if $F_0$ and $G$
satisfy certain smoothness assumptions. Indeed, their Theorem 5.1 establishes that under
differentiability assumptions on $F_0$ and $G$ the convergence is at the slower $n^{1/3}$ rate and
the limiting distribution is not normal. Groeneboom and Wellner (1992) also obtained
the uniform strong consistency of the GMLE for continuous $F_0$ and $G$. A variant of this
result was also proved by Wang and Gardiner (1996) using a totally different approach
and a slightly different set of assumptions.

The results of Groeneboom and Wellner (1992) give a fairly detailed description for
the case of continuous $F_0$ and $G$, while ours do so for the case of arbitrary $F_0$ and discrete
$G$. There are many practical situations in which $Y$ is discrete. In medical research, for
example, the data are often recorded as integers (to represent number of days, weeks etc.).
Motivated by this, we assume that the inspection time $Y$ is a discrete random variable
with density $g$. This assumption is used by several authors in survival analysis: Becker
and Melbye (1991) and Finkelstein (1986) among others.

Our paper is organized as follows. We introduce the GMLE in Section 2 and prove
its strong consistency. In Section 3 we establish the asymptotic normality and efficiency
of the GMLE at what we call regular points. Finally, Section 4 summarizes our work,
discusses some of its implications, addresses some questions raised by it and establishes
connections with the work of others. In particular, we show by means of an example
that our asymptotic normality result fails at nonregular points even though the rate of
convergence is still $n^{1/2}$.


2.  THE  CONSISTENCY  OF  THE  GMLE 

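By our assumptions, $Y$ is a discrete random variable with density $g$. Let $\mathcal{A}$ be the set
of possible values of $Y$, i.e., $\mathcal{A} = \{a \in \mathbb{R} : g(a) > 0\}$. For $a \in \mathcal{A}$, set

$$
N_n^-(a) = \frac{1}{n} \sum_{j=1}^{n} I[X_j \le a,\ Y_j = a], \qquad
N_n^+(a) = \frac{1}{n} \sum_{j=1}^{n} I[X_j > a,\ Y_j = a], \qquad
N_n(a) = \frac{1}{n} \sum_{j=1}^{n} I[Y_j = a].
$$

The generalized likelihood is given by

$$
\Lambda_n(F) = \prod_{a \in \mathcal{A}} F(a)^{n N_n^-(a)} \{1 - F(a)\}^{n N_n^+(a)}.
$$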


In the above we let $F$ range over the set $\mathcal{F}$ of all subdistribution functions. A function $F$
is called a subdistribution function if $F = \alpha F_1$ for some distribution function $F_1$ and some
number $\alpha$ in $[0, 1]$. Thus a subdistribution function has all the properties of a distribution
function except that its limit at infinity may be less than 1.

Note that $\Lambda_n(F)$ depends on $F$ only through the values of $F$ at the points $a \in \mathcal{A}$ for
which $N_n(a) > 0$. Thus there exists no unique maximizer of $\Lambda_n(F)$ in the set $\mathcal{F}$. But there
exists a uniquely determined $\mathcal{F}$-valued random element $\hat F_n$ which maximizes $\Lambda_n(F)$ and
satisfies $\hat F_n(b) = \sup\{\hat F_n(a) : a \le b,\ N_n(a) > 0\}$ for each $b \in \mathbb{R}$. Here we interpret the
supremum of the empty set as 0. We call $\hat F_n$ the GMLE of $F_0$. It is easy to check that
$\hat F_n(Y_{(1)}) = 0$ on the event $\{N_n^-(Y_{(1)}) = 0\}$ and $\hat F_n(Y_{(n)}) = 1$ on the event $\{N_n^+(Y_{(n)}) = 0\}$,
where $Y_{(1)}$ and $Y_{(n)}$ are the smallest and largest among $Y_1, \ldots, Y_n$. For later use, set

$$
\tilde F_n(a) =
\begin{cases}
N_n^-(a) / N_n(a) & \text{if } N_n(a) > 0,\\
0 & \text{otherwise}.
\end{cases}
$$
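
For readers who want to compute the GMLE, the sketch below uses the pool-adjacent-violators algorithm. It rests on an assumption not spelled out in the text above, namely the standard characterization of the case 1 GMLE at the observed inspection times as the weighted isotonic regression of the ratios $N_n^-(a)/N_n(a)$ with weights $N_n(a)$, which is how the explicit expression of Ayer et al. (1955) is usually evaluated. The function name `gmle_case1` and the toy data are hypothetical.

```python
def gmle_case1(inspection_times, indicators):
    """GMLE at the observed inspection times via pool-adjacent-violators.

    inspection_times: the Y_j; indicators: 1 if X_j <= Y_j, else 0.
    Returns a dict mapping each distinct inspection time to the fitted value.
    """
    # Aggregate (number with X_j <= a, number inspected at a) per distinct a.
    counts = {}
    for y, d in zip(inspection_times, indicators):
        below, total = counts.get(y, (0, 0))
        counts[y] = (below + d, total + 1)
    points = sorted(counts)
    # Each block holds [sum of indicators, sum of counts, number of points pooled].
    blocks = []
    for a in points:
        below, total = counts[a]
        blocks.append([below, total, 1])
        # Pool while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and (blocks[-2][0] * blocks[-1][1]
                                   > blocks[-1][0] * blocks[-2][1]):
            b, t, k = blocks.pop()
            blocks[-1][0] += b
            blocks[-1][1] += t
            blocks[-1][2] += k
    fitted = []
    for b, t, k in blocks:
        fitted.extend([b / t] * k)
    return dict(zip(points, fitted))

# Toy data: raw ratios are 1/2, 0, 1 at times 1, 2, 3; the first two get pooled.
print(gmle_case1([1, 1, 2, 2, 3, 3], [1, 0, 0, 0, 1, 1]))  # {1: 0.25, 2: 0.25, 3: 1.0}
```

Values between observed inspection times then follow from the convention $\hat F_n(b) = \sup\{\hat F_n(a) : a \le b,\ N_n(a) > 0\}$.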

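Theorem 2.1. The GMLE $\hat F_n$ satisfies $\hat F_n(a) \to F_0(a)$ almost surely for each $a \in \mathcal{A}$.

Proof. We use the following inequality given in Ayer et al. (1955, p. 644):

$$
\sum_{a \in \mathcal{A}} \{\hat F_n(a) - F_0(a)\}^2 N_n(a) \le \sum_{a \in \mathcal{A}} \{\tilde F_n(a) - F_0(a)\}^2 N_n(a),
$$

where $\tilde F_n(a) = N_n^-(a)/N_n(a)$ if $N_n(a) > 0$ and 0 otherwise. We get

$$
\sum_{a \in \mathcal{A}} \{\hat F_n(a) - F_0(a)\}^2 N_n(a) \le \sum_{a \in \mathcal{A}} |N_n(a) - g(a)| + \sum_{a \in \mathcal{A}} \{\tilde F_n(a) - F_0(a)\}^2 g(a).
$$

It follows from the SLLN that for each $a \in \mathcal{A}$, $N_n(a) \to g(a)$ and $\tilde F_n(a) \to F_0(a)$ almost
surely. Thus Scheffé's theorem (see Billingsley 1968, p. 224) implies

$$
\sum_{a \in \mathcal{A}} |N_n(a) - g(a)| \to 0 \quad \text{almost surely}
$$

and the Lebesgue dominated-convergence theorem implies

$$
\sum_{a \in \mathcal{A}} \{\tilde F_n(a) - F_0(a)\}^2 g(a) \to 0 \quad \text{almost surely}.
$$

It follows that $\sum_{a \in \mathcal{A}} \{\hat F_n(a) - F_0(a)\}^2 N_n(a) \to 0$ almost surely. This yields the desired
result, as $N_n(a)$ is eventually positive with probability 1 for each $a \in \mathcal{A}$. □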

The above result was already observed by Ayer et al. (1955) in the case when $\mathcal{A}$ is finite. In this case one can even conclude that the GMLE is uniformly strongly consistent on $\mathcal{A}$, i.e., $\sup_{a \in \mathcal{A}} |\hat F_n(a) - F_0(a)| \to 0$ almost surely. For countably infinite $\mathcal{A}$, however, additional assumptions are required to conclude this, as demonstrated by the following example.

Example 2.2. Suppose $\mathcal{A} = \{y_i : y_i = 1 - 1/i,\ i \ge 1\}$ and $G(y) = y$ for $y \in \mathcal{A}$. Then the GMLE will not be uniformly strongly consistent on $\mathcal{A}$ if $0 < F_0(1-) < 1$.

Proof. Let $Q_n = \bigcup_{i=1}^n \{X_i \le Y_i,\ Y_j < Y_i \text{ for all } j \ne i\}$. Then $Q_n \subset \{N_n^+(Y_{(n)}) = 0\}$. Since $\hat F_n(Y_{(n)}) = 1$ on the event $\{N_n^+(Y_{(n)}) = 0\}$, as observed prior to Theorem 2.1, and since $F_0(1-) < 1$, we cannot have uniform strong convergence if $\liminf_{n \to \infty} P(Q_n) > 0$. But

\[ P(Q_n) = n P\{X_1 \le Y_1,\ Y_j < Y_1,\ j = 2, \ldots, n\} \ge n F_0(y_n) \{G(y_n-)\}^{n-1} P(Y_1 \ge y_n), \]

so that by the choice of $\mathcal{A}$ and $G$

\[ \liminf_{n \to \infty} P(Q_n) \ge \liminf_{n \to \infty} F_0(y_n) \left(1 - \frac{1}{n-1}\right)^{n-1} = \frac{F_0(1-)}{e} > 0. \]

Consequently, the GMLE is not uniformly consistent on $\mathcal{A}$. □
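The limit in the preceding bound can be checked numerically. The sketch below evaluates the factor that multiplies $F_0(y_n)$, reading the last probability in the bound as $P(Y_1 \ge y_n) = 1/(n-1)$, which is an assumption about the garbled display; the function name is illustrative.

```python
import math

def lower_bound_factor(n):
    """n * G(y_n-)^(n-1) * P(Y >= y_n) for y_i = 1 - 1/i and G(y_i) = y_i."""
    g_left = 1.0 - 1.0 / (n - 1)  # G(y_n-) = G(y_{n-1})
    tail = 1.0 / (n - 1)          # P(Y >= y_n) = 1 - G(y_{n-1})
    return n * g_left ** (n - 1) * tail

for n in (10, 100, 10_000):
    # The factor approaches 1/e as n grows.
    print(n, lower_bound_factor(n), 1 / math.e)
```

Under this reading the factor tends to $1/e$, so the bound stays away from zero whenever $F_0(1-) > 0$.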

We  now  address  the  uniform  strong  consistency. 

Corollary 2.3. Suppose the set $\mathcal{A}$ is closed. Assume that $F_0(a-) = F_0(a)$ for each $a \in \mathcal{A}$ for which there is a strictly increasing sequence of points $\{a_i\}_{i \ge 1}$ in $\mathcal{A}$ such that $a_i \uparrow a$. Then the GMLE is uniformly strongly consistent on $\mathcal{A}$.

Proof. Let $m$ be a positive integer. Let $\mathcal{A}_i = \{a \in \mathcal{A} : x_{i-1} \le a < x_i\}$, $i = 1, \ldots, m$, where $x_0 = -\infty$, $x_m = \infty$ and $x_i = \inf\{x : F_0(x) \ge i/m\}$, $i = 1, \ldots, m-1$. Let $a \in \mathcal{A}$. Then $a \in \mathcal{A}_i$ for some $i = 1, \ldots, m$. Since $\mathcal{A}$ is a closed set, $a_i = \inf \mathcal{A}_i$ and $b_i = \sup \mathcal{A}_i$ belong to $\mathcal{A}$. Using the monotonicity of $\hat F_n$ and $F_0$, we find that

\[ |\hat F_n(a) - F_0(a)| \le \max\{|\hat F_n(b_i) - F_0(b_i)|,\ |\hat F_n(a_i) - F_0(a_i)|\} + F_0(b_i) - F_0(a_i). \]

If $b_i < x_i$, then $F_0(b_i) - F_0(a_i) \le 1/m$. If $b_i = x_i$, then $F_0(x_i) = F_0(x_i-) = i/m$ and $F_0(b_i) - F_0(a_i) \le 1/m$. This shows that $\limsup_{n \to \infty} \sup_{a \in \mathcal{A}} |\hat F_n(a) - F_0(a)| \le 1/m$ on the event $\Omega = \bigcap_{a \in \mathcal{A}} \{\lim_{n \to \infty} \hat F_n(a) = F_0(a)\}$. Since $m$ is arbitrary and $P(\Omega) = 1$ by Theorem 2.1, we obtain the desired result. □

In the next corollary, the set $\mathcal{A}$ need not be closed.

Corollary 2.4. Assume that $\mathcal{A} = \{a_i\}_{i \ge 1}$, where $a_i < a_{i+1}$ for all $i$. Let $x = \sup_i a_i$. If $F_0(x-) = 1$, then the GMLE is uniformly strongly consistent on $\mathcal{A}$.

Proof. Let $m$ be a positive integer. Since

\[ \sup_{a \in \mathcal{A}} |\hat F_n(a) - F_0(a)| \le \max_{1 \le i \le m} |\hat F_n(a_i) - F_0(a_i)| + 1 - F_0(a_m), \]

it follows from Theorem 2.1 that $\limsup_{n \to \infty} \sup_{a \in \mathcal{A}} |\hat F_n(a) - F_0(a)| \le 1 - F_0(a_m)$. The desired result follows, as $m$ is arbitrary and $F_0(x-) = 1$. □

We call a number $x$ a point of increase of $F_0$ if either $F_0(x) < F_0(y)$ for all $y > x$ or $F_0(y) < F_0(x)$ for all $y < x$. Note that, for each $\alpha$ in the interval $(0, 1)$, the left quantile $F_0^{-1}(\alpha) = \inf\{y : F_0(y) \ge \alpha\}$ is a point of increase of $F_0$.

Corollary 2.5. Suppose that $F_0$ is continuous and the closure of $\mathcal{A}$ contains the set $S$ of all points of increase of $F_0$. Then the GMLE is uniformly strongly consistent, i.e., $\sup_{x \in \mathbb{R}} |\hat F_n(x) - F_0(x)| \to 0$ almost surely.

Proof. Let $F_1, F_2, \ldots$ be subdistribution functions such that $F_n(a) \to F_0(a)$ for all $a \in \mathcal{A}$. Let $m$ be a positive integer. Since $F_0$ is continuous, there are points $x_1 < \cdots < x_m$ in $S$ such that $F_0(x_i) = i/(m+1)$. The continuity of $F_0$ and the fact that the closure of $\mathcal{A}$ contains $S$ imply that there are points $a_1 < \cdots < a_m$ in $\mathcal{A}$ such that $|F_0(a_i) - F_0(x_i)| < 1/m$. Using this and the monotonicity of $F_0$ and $F_n$ we derive that

|  Fn(x)  -  F0(x)|  <  m  |  Fn(ad  -  F0(a,)|  +  K  x  e  R. 

This shows that $F_n$ converges to $F_0$ uniformly.

By the above, the events $\bigcap_{a \in \mathcal{A}} \{\hat F_n(a) \to F_0(a)\}$ and $\{\sup_{x \in \mathbb{R}} |\hat F_n(x) - F_0(x)| \to 0\}$ are identical and thus have probability 1 by Theorem 2.1. □

3.  THE  ASYMPTOTIC  NORMALITY  OF  THE  GMLE 

We shall now discuss asymptotic normality and efficiency of $\hat F_n(x)$ for regular points $x$ as defined next. Let $\mathcal{A}_* = \mathcal{A} \cup \{-\infty, \infty\}$. For $x \in \mathcal{A}$, set

\[ x_- := \sup\{a \in \mathcal{A}_* : a < x\} \quad \text{and} \quad x_+ := \inf\{a \in \mathcal{A}_* : a > x\}. \]

We say $x$ is a regular point if $x$ belongs to $\mathcal{A}$, $x_-$ and $x_+$ belong to $\mathcal{A}_*$, $x_- < x < x_+$ and $F_0(x_-) < F_0(x) < F_0(x_+)$. It is worth mentioning that there may be infinitely many regular points. For example, if $F_0$ is strictly increasing and $\mathcal{A}$ is the set of all positive integers, then every positive integer is a regular point. The conditions imposed on regular points are somewhat similar to the assumption that $F_0$ and $G$ have positive and continuous derivatives needed in the asymptotic distribution result of the GMLE (see Groeneboom and Wellner 1992). However, their convergence rate is $n^{1/3}$, while we shall show that the convergence rate is $n^{1/2}$ under our assumptions.

Given a regular point $x$, $\hat F_n(x)$ may or may not be the same as $\bar F_n(x)$, as shown by the following example. Suppose that $F$ is the exponential distribution function and $\mathcal{A}$ =

I1,’2!-  2  %K,gul*r  Points- If  a  samPle  of^e  3  consists  of  observations 

1  ’  U  U  ’  2’ -  ^  then  =  (5, 1),  Which  is  the  same  as  (F(l), F(2))  On 

(iTpmT-  Kf  S'Ze  3  COnS‘StS  °f  obs5rvati„ons  {0.0), (1. 1),  (2,0)},  then 

Wi«-  1  ~  5’u°  ’  Wh,Ch  IS  n0t  thC  Same  as  (W'W))  =  (i{).  However,  the 
following lemma shows that the two estimators differ only on a set whose probability tends to zero.

Lemma 3.1. Suppose $x$ is a regular point. Then $P(\hat F_n(x) = \bar F_n(x)) \to 1$.

Proof. Assume first that $x_-$ and $x_+$ belong to $\mathcal{A}$. Let $B_n = \{\bar F_n(x_-) < \bar F_n(x) < \bar F_n(x_+)\}$ and $C_n = \{N_n(x_-) > 0,\ N_n(x) > 0,\ N_n(x_+) > 0\}$. It follows from the SLLN and $F_0(x_-) < F_0(x) < F_0(x_+)$ that $P(B_n) \to 1$, and from the SLLN that $P(C_n) \to 1$. In view of Theorem 2.1 in Ayer et al. (1955), we have, on the event $B_n \cap C_n$,

\[ \bar F_n(x_-) < \bar F_n(x) \le \hat F_n(x) \le \bar F_n(x) < \bar F_n(x_+). \]

That is, $\hat F_n(x) = \bar F_n(x)$. Thus the desired result follows, as $P(B_n \cap C_n) \to 1$. This proves the claim when $x_-$ and $x_+$ belong to $\mathcal{A}$.

If $x_+ \notin \mathcal{A}$ and $x_- \in \mathcal{A}$, then $x_+ = +\infty$, since $x$ is a regular point. Let

\[ B_n^+ = \{\bar F_n(x_-) < \bar F_n(x) < 1\} \quad \text{and} \quad C_n^+ = \{N_n(x_-) > 0,\ N_n(x) > 0\}. \]

It follows from the SLLN and $F_0(x_-) < F_0(x) < 1$ that $P(B_n^+ \cap C_n^+) \to 1$. In view of Theorem 2.1 in Ayer et al. (1955), we have, on the event $B_n^+ \cap C_n^+$, that $\bar F_n(x_-) < \bar F_n(x) \le \hat F_n(x) \le \bar F_n(x)$. That is, $\hat F_n(x) = \bar F_n(x)$. Thus the desired result follows, as $P(B_n^+ \cap C_n^+) \to 1$. This proves the claim when $x_-$ but not $x_+$ belongs to $\mathcal{A}$.

The proof when $x_+$ but not $x_-$ belongs to $\mathcal{A}$ is similar and will be omitted. □

The above result shows that $\hat F_n(x)$ has the same asymptotic properties as $\bar F_n(x)$. Thus the following result is immediate.

Theorem 3.2. Let $x$ be a regular point. Then

\[ \hat F_n(x) - F_0(x) = \frac{1}{n} \sum_{j=1}^{n} \frac{I[Y_j = x]\{\Delta_j - F_0(x)\}}{g(x)} + o_p(n^{-1/2}), \]

with $\Delta_j = I[X_j \le Y_j]$. Hence $n^{1/2}\{\hat F_n(x) - F_0(x)\}$ is asymptotically normal with mean 0 and variance $F_0(x)\{1 - F_0(x)\}/g(x)$. This asymptotic variance can be consistently estimated by $\hat F_n(x)\{1 - \hat F_n(x)\}/N_n(x)$. Also, if $x_1 < \cdots < x_m$ are regular points, then $n^{1/2}(\hat F_n(x_1) - F_0(x_1), \ldots, \hat F_n(x_m) - F_0(x_m))$ is asymptotically normal with mean vector 0 and diagonal covariance matrix.
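The variance estimator in Theorem 3.2 leads directly to a Wald-type interval at a regular point. The sketch below assumes the fitted value and the number of observations inspected at $x$ (that is, $n N_n(x)$) are already available; the function and argument names are hypothetical, and the default multiplier 1.96 corresponds to a nominal 95% level.

```python
import math

def wald_interval(f_hat, count_at_x, z=1.96):
    """Approximate interval for F0(x) at a regular point x.

    count_at_x = n * N_n(x) is the number of observations with Y_j = x, so the
    estimated variance of the GMLE at x is f_hat * (1 - f_hat) / count_at_x.
    """
    if count_at_x <= 0:
        raise ValueError("no observations at x")
    half = z * math.sqrt(f_hat * (1.0 - f_hat) / count_at_x)
    # Clip to [0, 1], since F0(x) is a probability.
    return max(0.0, f_hat - half), min(1.0, f_hat + half)
```

The interval is only a large-sample approximation; at points that are not regular the normal limit may fail, so the interval should not be used there.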

Let us now address efficiency considerations. For this fix a regular point $x$. It follows from the above theorem that $\hat F_n(x)$ has influence function $\psi$ given by

\[ \psi(\Delta, Y) = \frac{I[Y = x]\{\Delta - F_0(x)\}}{g(x)}. \]

We shall now show that $\psi$ is the efficient influence function for estimating $F_0(x)$. This will show that $\hat F_n(x)$ is a least-dispersed regular estimator of $F_0(x)$. The reader unfamiliar with these concepts should consult the monograph by Bickel et al. (1993). Let $\mathcal{H}$ be the set of all measurable functions $h$ such that $\int h \, dF_0 = 0$ and $\int h^2 \, dF_0 < \infty$. For $h \in \mathcal{H}$ define a sequence $F_{n,h}$ of distribution functions by

\[ F_{n,h}(t) = \int_{(-\infty, t]} (1 + n^{-1/2} h_n) \, dF_0, \qquad t \in \mathbb{R}, \]

where $h_n = h I[2|h| \le n^{1/2}] - \int h I[2|h| \le n^{1/2}] \, dF_0$. Then

\[ n^{1/2}\{F_{n,h}(x) - F_0(x)\} \to \int_{(-\infty, x]} h \, dF_0. \]

The tangent (or score function) $\tau_h$ associated with the perturbed distribution $F_{n,h}$ is given by

\[ \tau_h(\Delta, Y) = H(Y) \left( \frac{\Delta}{F_0(Y)} - \frac{1 - \Delta}{1 - F_0(Y)} \right) = \frac{H(Y)\{\Delta - F_0(Y)\}}{F_0(Y)\{1 - F_0(Y)\}}, \]

where $H(y) = \int_{(-\infty, y]} h \, dF_0$. Finally, it is easy to check that $E\{\psi(\Delta, Y) \tau_h(\Delta, Y)\} = \int_{(-\infty, x]} h \, dF_0$ for all $h \in \mathcal{H}$, and since $\psi$ is a tangent, i.e., $\psi = \tau_h$ for some $h \in \mathcal{H}$ with $H(x) = F_0(x)\{1 - F_0(x)\}/g(x)$, we obtain that $\psi$ is the efficient influence function if $G$ is known. However, $\psi$ is also the efficient influence function if $G$ is unknown, as the tangents for $G$ are orthogonal to the tangents $\{\tau_h : h \in \mathcal{H}\}$ for $F_0$.

4.  CONCLUDING  REMARKS 

The main results of our paper are given in Theorems 2.1 and 3.2. Theorem 2.1 gives the strong consistency at each point in $\mathcal{A}$, while Theorem 3.2 obtains asymptotic normality at regular points. Thus $\hat F_n(x)$ is both strongly consistent and asymptotically normally distributed at each regular point $x$. Typically, consistency fails to hold for points of increase that are not in the closure of $\mathcal{A}$. Also, the asymptotic normality result may not hold for nonregular points, as the following example shows.

Example 4.1. Assume that $\mathcal{A}$ consists of just four points, namely $a_1 < a_2 < a_3 < a_4$, and that $0 < F(a_1) < F(a_2) = F(a_3) < F(a_4) < 1$. On the event $A_n = \{\bar F_n(a_1) < \bar F_n(a_2) \le \bar F_n(a_3) < \bar F_n(a_4)\}$ we have $\hat F_n(a_i) = \bar F_n(a_i)$, $i = 1, \ldots, 4$, and on the event $B_n = \{\bar F_n(a_1) < \tilde F_n < \bar F_n(a_4),\ \bar F_n(a_2) > \bar F_n(a_3)\}$ we have $\hat F_n(a_i) = \bar F_n(a_i)$ for $i = 1, 4$ and $\hat F_n(a_2) = \hat F_n(a_3) = \tilde F_n$, where

\[ \tilde F_n = \frac{N_n^-(a_2) + N_n^-(a_3)}{N_n(a_2) + N_n(a_3)}. \]

It follows from the SLLN that $P(A_n \cup B_n) \to 1$. This shows that the asymptotic distribution of $\sqrt{n}(\hat F_n(a_2) - F_0(a_2),\ \hat F_n(a_3) - F_0(a_3))^T$ is the same as that of $\sqrt{n}(F_n^*(a_2) - F_0(a_2),\ F_n^*(a_3) - F_0(a_3))^T$, where $(F_n^*(a_2), F_n^*(a_3)) = (\bar F_n(a_2), \bar F_n(a_3))$ if $\bar F_n(a_2) \le \bar F_n(a_3)$, and $F_n^*(a_2) = F_n^*(a_3) = \tilde F_n$ if $\bar F_n(a_2) > \bar F_n(a_3)$. An application of Slutsky's theorem yields that the asymptotic distribution of $\sqrt{n}(F_n^*(a_2) - F_0(a_2),\ F_n^*(a_3) - F_0(a_3))$ is the distribution of the bivariate random vector $Z^*$ defined by

\[ Z^* = \begin{pmatrix} Z_2^* \\ Z_3^* \end{pmatrix} = \begin{pmatrix} Z_2 \\ Z_3 \end{pmatrix} I[Z_2 \le Z_3] + \frac{g(a_2) Z_2 + g(a_3) Z_3}{g(a_2) + g(a_3)} \begin{pmatrix} 1 \\ 1 \end{pmatrix} I[Z_2 > Z_3], \]

where $Z_2$ and $Z_3$ are independent normal random variables with zero means and variances $F_0(a_2)\{1 - F_0(a_2)\}/g(a_2)$ and $F_0(a_3)\{1 - F_0(a_3)\}/g(a_3)$, respectively. One can check that the distributions of $Z_2^*$ and $Z_3^*$ are not normal.
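The non-normality of the limit can be seen in a small simulation. The sketch below draws two independent centered normal variables, keeps them when they are ordered and otherwise replaces both by their $g$-weighted average, as the pooled estimator does in the limit. The numerical choices (both $g$ values equal to 0.25 and both variances equal to 1, which corresponds to $F_0(a_2) = F_0(a_3) = 1/2$) are arbitrary illustrations and not taken from the paper.

```python
import random

def draw_z_star(rng, g2, g3, var2, var3):
    """One draw of (Z2*, Z3*): keep (Z2, Z3) if ordered, else pool with weights g."""
    z2 = rng.gauss(0.0, var2 ** 0.5)
    z3 = rng.gauss(0.0, var3 ** 0.5)
    if z2 <= z3:
        return z2, z3
    pooled = (g2 * z2 + g3 * z3) / (g2 + g3)
    return pooled, pooled

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [draw_z_star(rng, 0.25, 0.25, 1.0, 1.0) for _ in range(100_000)]
mean_first = sum(z2 for z2, _ in draws) / len(draws)
print(mean_first)  # negative, whereas a centered normal limit would have mean 0
```

Because pooling only ever lowers the first coordinate, its mean is strictly negative, so the first coordinate cannot follow a centered normal law.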
The corollaries in Section 2 address uniform strong consistency under different sets of assumptions. Corollary 2.3 implies that the GMLE is uniformly strongly consistent on $\mathcal{A}$ if $F$ is continuous and $\mathcal{A}$ is closed. Corollary 2.4 gives uniform consistency on $\mathcal{A}$ if this set is generated by an increasing sequence. If $F$ is increasing and $\mathcal{A} \subset \{x \in \mathbb{R} : 0 < F(x) < 1\}$, then the assumptions of Corollary 2.4 imply that each point in $\mathcal{A}$ is regular and thus, in view of Theorem 3.2, the asymptotic normality at each point in $\mathcal{A}$.

Corollary  2.5  is  of  interest  from  a  theoretical  point  of  view  in  that  it  provides  conditions 
that  guarantee  the  uniform  strong  consistency  on  the  entire  line.  From  a  practical  point 
of  view  the  imposed  conditions  are  rather  unrealistic.  For  example,  if  F  is  the  uniform 
distribution on [0, 1], then $\mathcal{A}$ has to contain a dense subset of [0, 1]. But distributions
G  with  this  property  are  rarely  encountered  in  practice.  Note  also  that  the  assumptions 
of  Corollary  2.5  rule  out  the  existence  of  regular  points,  so  that  we  cannot  conclude  the 
asymptotic  normality  from  Theorem  3.2. 

It is an open question whether the parametric convergence rate holds at each point in $\mathcal{A}$. Since one can show that $\bar F_n$ has parametric convergence rate at each point in $\mathcal{A}$, we conjecture that the GMLE has the same property although the limit might not be normal as Example 4.1 shows.

Groeneboom and Wellner (1992) showed that the GMLE is uniformly strongly consistent if $F_0$ and $G$ are continuous and $P_{F_0} \ll P_G$. The latter means that the probability measure $P_{F_0}$ induced by $F_0$ is absolutely continuous with respect to the probability measure $P_G$ induced by $G$. In view of our Corollary 2.5, we expect the uniform strong consistency also if $F_0$ is continuous and if $G$ is a mixture of a continuous distribution function and a discrete distribution function which satisfies the assumptions in Corollary 2.5.

Groeneboom and Wellner (1992) showed that under the additional assumption that $F_0$ and $G$ have positive derivatives at a point $t_0$, the convergence rate of $\hat F_n(t_0)$ is $n^{1/3}$. It is an open question whether the rate is still valid without this additional assumption.

Our parametric convergence rate $n^{1/2}$ in Theorem 3.2 is in contrast to the nonparametric convergence rate $n^{1/3}$ under their assumptions. Our Theorem 3.2 is trivially true under the assumption that both $X$ and $Y$ take on the same finitely many values. In this case, the problem reduces to the estimation of the parameters of a multinomial distribution function, which is a parametric problem giving the usual convergence rate. This simple fact was noticed without proof by Peto (1973) and Turnbull (1976) as they both conjectured (incorrectly) that the GMLE has a convergence rate $n^{1/2}$ in general. We have established the parametric convergence rate of the GMLE for the first time under the assumption that $X$ and $Y$ may take infinitely many values.
ASYMPTOTIC VARIANCE OF THE GMLE OF A SURVIVAL
FUNCTION WITH INTERVAL-CENSORED DATA

SUMMARY. Interval-censored data are generated by a random survival time $X$ and a random censoring interval. We either observe the exact survival time or only know that the survival time lies within the censoring interval. Turnbull (1976) proposes a self-consistent algorithm for obtaining the generalized maximum likelihood estimator (GMLE) of a survival function with interval-censored data. Yu, Li and Wong (1996) prove the strong consistency of the GMLE. In this paper, we establish the asymptotic normality of the GMLE and self-consistent estimators (SCEs) and present a consistent estimator of the asymptotic variance of the GMLE and SCEs with interval-censored data.


1.  Introduction 


We consider the nonparametric estimation of the distribution function $F$ of a survival time $X$ with incomplete observations due to interval censoring. Interval-censored (IC) data are bivariate observations $(L_i, R_i)$, $i = 1, \ldots, n$, where $L_i \le R_i$. If $L_i = R_i$, then a survival time $X_i = L_i = R_i$ is observed and we say it is an exact observation; if $L_i < R_i$, then $X_i$ is censored and a censoring interval is observed instead.
Abstract

In the case 2 interval-censorship model the survival time $X$ is not directly observable, but is only known to lie before $Y$, between $Y$ and $Z$, or after $Z$, where $(Y, Z)$ is a pair of observable inspection times such that $Y < Z$. We consider the large-sample properties of the generalized maximum likelihood estimator (GMLE) of the distribution function of $X$ with case 2 interval-censored data in which the inspection times are discrete random variables. We prove the strong consistency of the GMLE at the support points of the inspection times and its asymptotic normality in the case of only finitely many support points.

AMS  classification:  primary  62G05;  secondary  62G20 

Keywords: Asymptotic normality; Consistency; Self-consistent algorithm


1.  Introduction 

In many biomedical studies, the random survival time $X$ of interest is never observed and is only known to lie before an inspection time $Y$, between two inspection times $Y$ and $Z$, or after the inspection time $Z$. This censoring scheme is referred to as case 2 interval censoring. Examples can be found in cancer studies (Finkelstein and Wolfe, 1985) and AIDS studies (Becker and Melbye, 1991; Aragon and Eberly, 1992). We assume throughout that $X$ and $(Y, Z)$ are independent and that $Y < Z$ with probability one. We denote the distribution function of $X$ by $F_0$ and the joint distribution function of $(Y, Z)$ by $G$. The available data for the case 2 interval-censorship model are thus

\[ (Y_j,\ Z_j,\ I[X_j \le Y_j],\ I[Y_j < X_j \le Z_j]), \qquad j = 1, \ldots, n, \]

where $(X_1, Y_1, Z_1), \ldots, (X_n, Y_n, Z_n)$ are independent copies of $(X, Y, Z)$ and $I[A]$ is the indicator of the set $A$.

Groeneboom and Wellner (1992) considered the case 2 interval-censorship model with continuous $F_0$ and $G$. They proposed an iterative convex minorant algorithm to calculate the GMLE and proved the strong consistency of the GMLE. They showed that the estimator of $F_0$ obtained at the first step of the iterative convex minorant algorithm converges to $F_0$ at the $(n \log n)^{1/3}$ rate and that its asymptotic distribution is not normal. The asymptotic distribution of the GMLE remains unresolved. There are other algorithms for computing the GMLE. They include Peto's (1973) Newton-Raphson algorithm and Turnbull's (1976) self-consistent algorithm.

In this paper we assume that the inspection times $Y$ and $Z$ are discrete. Let

\[ \mathcal{A} = \{a \in \mathbb{R} : P(Y = a) + P(Z = a) > 0\} \]

denote the set of possible values of $Y$ or $Z$. We establish the strong consistency of the GMLE at each point in $\mathcal{A}$. From this we can derive the uniform strong consistency of the GMLE if $F_0$ is continuous and $\mathcal{A}$ is dense in $[0, \infty)$. This is done in Section 2.

In Section 3 we consider the case of finite $\mathcal{A}$. We obtain the joint asymptotic normality of the GMLE at the usual $\sqrt{n}$ rate for the points in $\mathcal{A}$ and present a consistent estimator of its asymptotic variance.

2.  The  consistency  of  the  GMLE 

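Let $g(a, b) = P(Y = a, Z = b)$ and $\mathcal{B} = \{(a, b) : g(a, b) > 0\}$. For $(a, b) \in \mathcal{B}$, let

\[ N_n^-(a, b) = \frac{1}{n} \sum_{j=1}^{n} I[X_j \le a,\ Y_j = a,\ Z_j = b], \]

\[ N_n^0(a, b) = \frac{1}{n} \sum_{j=1}^{n} I[a < X_j \le b,\ Y_j = a,\ Z_j = b], \]

\[ N_n^+(a, b) = \frac{1}{n} \sum_{j=1}^{n} I[X_j > b,\ Y_j = a,\ Z_j = b], \]

\[ N_n(a, b) = \frac{1}{n} \sum_{j=1}^{n} I[Y_j = a,\ Z_j = b]. \]

Then the generalized likelihood is given by

\[ \Lambda_n(F) = \prod_{(a,b) \in \mathcal{B}} F(a)^{n N_n^-(a,b)} [F(b) - F(a)]^{n N_n^0(a,b)} [1 - F(b)]^{n N_n^+(a,b)} \]

and the normalized generalized log-likelihood is

\[ \mathcal{L}_n(F) = \sum_{(a,b) \in \mathcal{B}} \{ N_n^-(a, b) \log[F(a)] + N_n^0(a, b) \log[F(b) - F(a)] + N_n^+(a, b) \log[1 - F(b)] \}. \]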
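The normalized log-likelihood can be evaluated directly from the observed triples, since summing over observations is the same as summing the weighted counts over inspection pairs. The sketch below assumes each observation is coded as (y, z, kind), with kind equal to "left", "mid" or "right" according to whether the survival time falls at or before y, in (y, z], or after z, and that the candidate F is passed as a Python callable; these names and this coding are illustrative choices, not the paper's.

```python
import math

def normalized_loglik(observations, F):
    """(1/n) * sum of log-likelihood contributions under case 2 censoring.

    Uses the convention log 0 = -infinity.
    """
    def log0(p):
        return math.log(p) if p > 0 else -math.inf

    total = 0.0
    for y, z, kind in observations:
        if kind == "left":      # X <= y
            total += log0(F(y))
        elif kind == "mid":     # y < X <= z
            total += log0(F(z) - F(y))
        elif kind == "right":   # X > z
            total += log0(1.0 - F(z))
        else:
            raise ValueError(f"unknown kind: {kind!r}")
    return total / len(observations)
```

Only cells that actually occur in the sample contribute, so the convention 0 log 0 = 0 is respected automatically.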
Here and below we interpret $0 \log 0 = 0$ and $\log 0 = -\infty$. In the above we let $F$ range over the set $\mathcal{F}$ of all subdistribution functions. A function $F_1$ is called a subdistribution function if $F_1 = aF$ for some distribution function $F$ and a number $a \in [0, 1]$. Note that $\Lambda_n(F)$ and $\mathcal{L}_n(F)$ depend on $F$ only through the values of $F$ at the points $a \in \mathcal{A}$. Thus there exists no unique maximizer of $\Lambda_n(F)$ over the set $\mathcal{F}$. However, there exists a unique maximizer $\hat F_n$ of $\Lambda_n(F)$ over the set $\mathcal{F}$ which satisfies $\hat F_n(x) = \sup\{\hat F_n(a) : a \le x,\ \sum_{j=1}^{n} (I[Y_j = a] + I[Z_j = a]) > 0\}$ for all $x \in \mathbb{R}$. Here we interpret the supremum of the empty set as 0. We call $\hat F_n$ the GMLE of $F_0$.


Theorem 2.1. The GMLE $\hat F_n$ satisfies $\hat F_n(a) \to F_0(a)$ almost surely for all $a \in \mathcal{A}$.

Proof. Verify that

\[ L(F) = E(\mathcal{L}_n(F)) = \sum_{(a,b) \in \mathcal{B}} g(a, b) h_{a,b}(F) \]

with

\[ h_{a,b}(F) = F_0(a) \log[F(a)] + [F_0(b) - F_0(a)] \log[F(b) - F(a)] + [1 - F_0(b)] \log[1 - F(b)]. \]

It is easy to check that the expression $h_{a,b}(F)$ is maximized by a nondecreasing function $F$ into $[0, 1]$ if and only if $F(a) = F_0(a)$ and $F(b) = F_0(b)$. Thus, $F_0$ maximizes $L(F)$ and any other nondecreasing function into $[0, 1]$ that maximizes $L(F)$ coincides with $F_0$ at the points in $\mathcal{A}$.

Note that $\mathcal{L}_n(F_0) = \frac{1}{n} \sum_{j=1}^{n} \psi(X_j, Y_j, Z_j)$, where $\psi$ is the map defined by

\[ \psi(x, y, z) = I[x \le y] \log(F_0(y)) + I[y < x \le z] \log(F_0(z) - F_0(y)) + I[z < x] \log(1 - F_0(z)). \]

Thus, it follows from the SLLN that $\mathcal{L}_n(F_0) \to L(F_0)$ almost surely. By the definition of the GMLE, $\mathcal{L}_n(\hat F_n) \ge \mathcal{L}_n(F_0)$. Consequently,

\[ \liminf_{n \to \infty} \mathcal{L}_n(\hat F_n) \ge \liminf_{n \to \infty} \mathcal{L}_n(F_0) = L(F_0) \quad \text{almost surely.} \]

Let $\Omega'$ denote the event on which $\liminf_{n \to \infty} \mathcal{L}_n(\hat F_n) \ge L(F_0)$ and, for each $(a, b) \in \mathcal{B}$, $N_n^-(a, b) \to F_0(a) g(a, b)$, $\sup_n N_n^-(a, b) = 0$ if $F_0(a) = 0$, $N_n^0(a, b) \to (F_0(b) - F_0(a)) g(a, b)$, $\sup_n N_n^0(a, b) = 0$ if $F_0(b) = F_0(a)$, $N_n^+(a, b) \to (1 - F_0(b)) g(a, b)$ and $\sup_n N_n^+(a, b) = 0$ if $F_0(b) = 1$. Fix an $\omega \in \Omega'$. Let the function $F^*$ be a limit point of $\hat F_n(\cdot, \omega)$ in the sense that $\hat F_{k_n}(a, \omega) \to F^*(a)$ for all $a \in \mathcal{A}$ and for some sequence $\{k_n\}$ of positive integers tending to infinity. We now show that

\[ L(F^*) \ge L(F_0). \]

Let $x_{k_n}(a, b)$ denote the value of the random variable

\[ N_{k_n}^-(a, b) \log(\hat F_{k_n}(a)) + N_{k_n}^0(a, b) \log(\hat F_{k_n}(b) - \hat F_{k_n}(a)) + N_{k_n}^+(a, b) \log(1 - \hat F_{k_n}(b)) \]

at the point $\omega$. Thus, by the definition of $\Omega'$,

\[ \liminf_{n \to \infty} \sum_{(a,b) \in \mathcal{B}} x_{k_n}(a, b) \ge L(F_0) \]

and

\[ x_{k_n}(a, b) \to g(a, b) h_{a,b}(F^*) \]

for each $(a, b) \in \mathcal{B}$. Note also that $x_{k_n}(a, b) \le 0$ for all $(a, b) \in \mathcal{B}$. Thus an application of Fatou's Lemma yields

\[ \limsup_{n \to \infty} \sum_{(a,b) \in \mathcal{B}} x_{k_n}(a, b) = -\liminf_{n \to \infty} \sum_{(a,b) \in \mathcal{B}} -x_{k_n}(a, b) \le -\sum_{(a,b) \in \mathcal{B}} \liminf_{n \to \infty} (-x_{k_n}(a, b)) = \sum_{(a,b) \in \mathcal{B}} g(a, b) h_{a,b}(F^*) = L(F^*). \]

Combining the above yields $L(F_0) \le L(F^*)$. As $F_0$ maximizes $L$, we can conclude that $L(F^*) = L(F_0)$ and therefore $F^*(a) = F_0(a)$ for all $a \in \mathcal{A}$. Since $\omega$ was arbitrary and $\Omega'$ has probability one, we can infer the desired result. □


If $\mathcal{A}$ is a finite set, then it follows from the theorem that the GMLE is uniformly strongly consistent on $\mathcal{A}$. For arbitrary $\mathcal{A}$, the uniform strong consistency of the GMLE requires additional assumptions. The proofs of the following corollary and theorem are similar to Yu et al. (1998) and are thus omitted here.

Corollary 2.2. Suppose that $\mathcal{A}$ is a closed set. Assume that $F_0(a-) = F_0(a)$ for every $a \in \mathcal{A}$ for which there is a sequence of points $\{a_i\} \subset \mathcal{A}$ such that $a_i \uparrow a$. Then the GMLE is uniformly strongly consistent on $\mathcal{A}$, i.e., $\sup_{a \in \mathcal{A}} |\hat F_n(a) - F_0(a)| \to 0$ almost surely.

We call a number $x$ a point of increase of $F_0$ if either $F_0(x) < F_0(y)$ for all $y > x$ or $F_0(y) < F_0(x)$ for all $y < x$.

Theorem 2.3. Suppose that $F_0$ is continuous and the closure of $\mathcal{A}$ contains the set of all points of increase of $F_0$. Then the GMLE is uniformly strongly consistent, i.e., $\sup_{x \in \mathbb{R}} |\hat F_n(x) - F_0(x)| \to 0$ almost surely.

3.  The  asymptotic  normality  of  the  GMLE 

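In this section we shall obtain the asymptotic normality of the GMLE under the assumption that $\mathcal{A}$ contains finitely many elements and

\[ 0 < F_0(a) < F_0(b) < 1 \quad \text{for all } a, b \text{ in } \mathcal{A} \text{ such that } a < b. \]

Note that under the current assumption the standard method for finite parametric models can be used. Let $\mathcal{F}_*$ denote the set of all distribution functions $F$ which satisfy $0 < F(a) < F(b) < 1$ for all $a, b$ in $\mathcal{A}$ with $a < b$. For $F \in \mathcal{F}_*$ and $a \in \mathcal{A}$, let

\[ \dot{\mathcal{L}}_{n,a}(F) = \sum_{b : (a,b) \in \mathcal{B}} \left( \frac{N_n^-(a, b)}{F(a)} - \frac{N_n^0(a, b)}{F(b) - F(a)} \right) + \sum_{c : (c,a) \in \mathcal{B}} \left( \frac{N_n^0(c, a)}{F(a) - F(c)} - \frac{N_n^+(c, a)}{1 - F(a)} \right), \]

\[ \ddot{\mathcal{L}}_{n,a,a}(F) = -\sum_{b : (a,b) \in \mathcal{B}} \left( \frac{N_n^-(a, b)}{F^2(a)} + \frac{N_n^0(a, b)}{(F(b) - F(a))^2} \right) - \sum_{c : (c,a) \in \mathcal{B}} \left( \frac{N_n^0(c, a)}{(F(a) - F(c))^2} + \frac{N_n^+(c, a)}{(1 - F(a))^2} \right), \]

and

\[ \ddot{\mathcal{L}}_{n,a,b}(F) = \ddot{\mathcal{L}}_{n,b,a}(F) = \frac{N_n^0(a, b)}{(F(b) - F(a))^2}, \qquad a, b \in \mathcal{A},\ a < b. \]

Then

\[ \dot{\mathcal{L}}_{n,a}(F) = \frac{\partial \mathcal{L}_n(F)}{\partial F(a)} \quad \text{and} \quad \ddot{\mathcal{L}}_{n,a,b}(F) = \ddot{\mathcal{L}}_{n,b,a}(F) = \frac{\partial^2 \mathcal{L}_n(F)}{\partial F(a) \, \partial F(b)}, \qquad a, b \in \mathcal{A}. \]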


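Let $a_1 < a_2 < \cdots < a_m$ denote the elements of $\mathcal{A}$. For $F \in \mathcal{F}_*$, let $\dot{\mathcal{L}}_n(F)$ denote the $m$-dimensional column vector with entries $(\dot{\mathcal{L}}_n(F))_i = \dot{\mathcal{L}}_{n,a_i}(F)$, $i = 1, \ldots, m$, and $\ddot{\mathcal{L}}_n(F)$ denote the $m \times m$ matrix with entries

\[ (\ddot{\mathcal{L}}_n(F))_{ij} = \ddot{\mathcal{L}}_{n,a_i,a_j}(F), \qquad i, j = 1, \ldots, m. \]

Finally, set

\[ J = n E[\dot{\mathcal{L}}_n(F_0) (\dot{\mathcal{L}}_n(F_0))^T] = -E[\ddot{\mathcal{L}}_n(F_0)]. \]

The matrix $J$ is positive definite since

\[ J = D + \sum_{1 \le i < j \le m} \frac{g(a_i, a_j)}{F_0(a_j) - F_0(a_i)} (e_i - e_j)(e_i - e_j)^T, \]

where $D$ is the diagonal matrix with positive diagonal elements

\[ d_i = \frac{P\{Y = a_i\}}{F_0(a_i)} + \frac{P\{Z = a_i\}}{1 - F_0(a_i)}, \]

and $e_1, \ldots, e_m$ denote the standard basis in $\mathbb{R}^m$. It is easy to verify that $\ddot{\mathcal{L}}_n(\hat F_n) \to E[\ddot{\mathcal{L}}_n(F_0)] = -J$.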
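The decomposition of $J$ into a diagonal part and rank-one terms can be assembled numerically for a small support. The sketch below assumes the support points are indexed 0, ..., m-1 in increasing order, that `g` maps index pairs (i, j) with i < j to the probability that the inspection times equal the i-th and j-th points, and that `F0` lists the true distribution function at the support points; all names are illustrative.

```python
def information_matrix(g, F0):
    """J = D + sum_{i<j} g(a_i, a_j) / (F0(a_j) - F0(a_i)) * (e_i - e_j)(e_i - e_j)^T."""
    m = len(F0)
    p_y = [0.0] * m  # marginal P(Y = a_i)
    p_z = [0.0] * m  # marginal P(Z = a_i)
    for (i, j), mass in g.items():
        p_y[i] += mass
        p_z[j] += mass
    J = [[0.0] * m for _ in range(m)]
    for i in range(m):
        # Diagonal part D.
        J[i][i] = p_y[i] / F0[i] + p_z[i] / (1.0 - F0[i])
    for (i, j), mass in g.items():
        # Rank-one contribution of the pair (a_i, a_j).
        w = mass / (F0[j] - F0[i])
        J[i][i] += w
        J[j][j] += w
        J[i][j] -= w
        J[j][i] -= w
    return J
```

With a single inspection pair and F0 equal to 0.25 and 0.75 at the two points, the function returns the matrix with rows (6, -2) and (-2, 6), which is symmetric with positive determinant, in line with the positive definiteness claimed above.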

It thus follows that on the event $\{\hat F_n \in \mathcal{F}_*\}$,

\[ 0 = \dot{\mathcal{L}}_n(\hat F_n) = \dot{\mathcal{L}}_n(F_0) - J \Delta_n + o_p(\|\Delta_n\|), \]

where $\Delta_n$ is the $m$-dimensional column vector with entries $\hat F_n(a_i) - F_0(a_i)$, $i = 1, \ldots, m$. It follows from the CLT that $n^{1/2} \dot{\mathcal{L}}_n(F_0)$ is asymptotically normal with mean 0 and dispersion matrix $J$. This shows that $\Delta_n = J^{-1} \dot{\mathcal{L}}_n(F_0) + o_p(n^{-1/2})$. Thus, we have the following result.

Theorem 3.1. Suppose $F_0$ belongs to $\mathcal{F}_*$. Then

\[ n^{1/2} \begin{pmatrix} \hat F_n(a_1) - F_0(a_1) \\ \vdots \\ \hat F_n(a_m) - F_0(a_m) \end{pmatrix} \]

is asymptotically normal with mean 0 and dispersion matrix $J^{-1}$. A strongly consistent estimator of $J$ is
given  by  J 
We consider the problem of estimation of a joint distribution function of a multivariate random vector with interval-censored data. The generalized maximum

1. INTRODUCTION

We consider the estimation of a joint distribution function $F_0$ of a multivariate random vector $X = (X_1, \ldots, X_d)$ which is subject to interval censoring. In interval censoring, the value of each coordinate variable $X_i$ may not be directly observable; instead, a pair of extended real numbers $L_i$ and $R_i$ such that $L_i \le X_i \le R_i$ are always observed. The observations $L_i$ and $R_i$ satisfy one of the following four conditions: $L_i = R_i$ (exact), $0 = L_i < R_i$ (left censored), $L_i < R_i = \infty$ (right censored), and $0 < L_i < R_i < \infty$ (strictly interval censored). A $d$-dimensional interval-censored observation corresponding to $X$ is represented by the $2d$-dimensional vector $(L_1, R_1, \ldots, L_d, R_d)$.

Multivariate interval-censored data arise in a variety of life testing situations and biomedical studies. We describe a clinical study in the following example that gives rise to bivariate ($d = 2$) interval-censored data.

Example 1.1 (The Italian-American Cataract Study Group (1994)). A total of 1399 persons, between 45 and 79 years of age, who had been identified in a clinic-based case control study were enrolled in a follow-up study between 1985 and 1988. The follow-up study was designed to estimate the rate of incidence and progression of cortical, nuclear, and posterior subcapsular cataracts and to evaluate the usefulness of the Lens Opacities Classification System II in a longitudinal study. Beginning in 1989, follow-up lens photographs were taken and graded at six-month intervals. Patients might skip some visits. Data were obtained from Zeiss slit-lamp and Neitz retroillumination lens photographs at each patient's visit. The exact time that the event of interest occurred was only known to lie within the period between two consecutive visits, or was right censored if by the end of the study the event still had not taken place. Consequently, bivariate interval-censored data were encountered.

At present, nonparametric estimation of a joint distribution function with multivariate interval-censored data has not been considered. A current practice is to take the midpoint of the interval $(L, R)$ as an exact observation unless it is right censored. Then Dabrowska's (1988) Kaplan-Meier estimator on the plane or van der Laan's (1996) repaired generalized maximum likelihood estimator can be applied to such data. Another practice is to treat the right endpoints of the interval-censored data as exact observations unless they are right censored (see Samuelsen and Kongerud (1994)). However, these two practices will introduce bias in the analysis (Samuelsen and Kongerud (1994)).

Multivariate right-censored data are special cases of multivariate interval-censored data. References for nonparametric estimation of distribution functions with multivariate right-censored data can be found in Campbell (1981), Hanley and Parnes (1983), Tsai et al. (1986), Dabrowska (1988), Gill (1992), Prentice and Cai (1992), Lin and Ying (1993), and van der Laan (1996), among others.

Nonparametric estimation of a distribution function with univariate interval-censored data has been studied by Peto (1973), Turnbull (1976), Tsai and Crowley (1985), Chang and Yang (1987), Groeneboom and Wellner (1992), Gu and Zhang (1993), and Yu et al. (1996 and 1998), among others.

In  Section  2,  we  discuss  generalized  maximum  likelihood  estimation  of 
F0  based  on  multivariate  interval-censored  data  and  formulate  the  case  2 
multivariate  interval  censorship  model.  We  establish  consistency  of  the 
generalized  maximum  likelihood  estimate  (GMLE)  of  F0  in  Section  3  and 
asymptotic  normality  of  the  GMLE  in  Section  4. 
2. METHOD OF ESTIMATION

Let $X = (X_1, \ldots, X_d)$ be a $d$-dimensional random survival vector with a joint distribution function $F_0(x)$, where $x = (x_1, \ldots, x_d)$. The observable random vector is $(L_1, R_1, \ldots, L_d, R_d)$, where $L_i \le R_i$ for all $i$. Suppose that

\[ (L_{11}, R_{11}, \ldots, L_{1d}, R_{1d}),\ \ldots,\ (L_{n1}, R_{n1}, \ldots, L_{nd}, R_{nd}) \]

are i.i.d. copies of $(L_1, R_1, \ldots, L_d, R_d)$. We want to estimate the joint distribution function $F_0(x)$ (or the survival function $S_0(x) = P(X_1 > x_1, \ldots, X_d > x_d)$). Each univariate interval-censored datum $(L_{ij}, R_{ij})$ can be viewed as an interval $I_{ij}$, where

\[ I_{ij} = \begin{cases} [L_{ij}, R_{ij}] & \text{if } L_{ij} = R_{ij}, \\ (L_{ij}, R_{ij}] & \text{if } L_{ij} < R_{ij}; \end{cases} \]

therefore, each multivariate interval-censored observation can be viewed as a rectangular set $\mathcal{I}_i = I_{i1} \times \cdots \times I_{id}$, $i = 1, \ldots, n$.

Define a maximal intersection (MI), $A$, with respect to the $\mathcal{I}_i$'s to be a nonempty finite intersection of the $\mathcal{I}_i$'s such that for each $i$, $A \cap \mathcal{I}_i = \emptyset$ or $A$. For example, let $\mathcal{I}_1 = (0, 2] \times (1, 3]$, $\mathcal{I}_2 = (0, 4] \times (1, 5]$, $\mathcal{I}_3 = (3, 5] \times (4, 8]$, and $\mathcal{I}_4 = (3, 5] \times (4, 8]$. Then the possible MIs are $(0, 2] \times (1, 3]$ and $(3, 4] \times (4, 5]$. Let $\{A_1, \dots, A_m\}$ be the collection of all possible distinct MIs.
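
The definition of a maximal intersection can be checked mechanically on the four rectangles of the example. The following is only a brute-force sketch: the function names are hypothetical, rectangles are assumed to be products of half-open intervals $(l, r]$ (exact observations with $L = R$ are not handled), and the search over all subsets is exponential, so it is meant for illustration rather than for real data.

```python
from itertools import combinations

def intersect(rects):
    """Intersect rectangles given as tuples of (left, right] intervals; None if empty."""
    dims = len(rects[0])
    out = []
    for d in range(dims):
        lo = max(r[d][0] for r in rects)
        hi = min(r[d][1] for r in rects)
        if lo >= hi:              # the interval (lo, hi] is empty
            return None
        out.append((lo, hi))
    return tuple(out)

def maximal_intersections(rects):
    """Brute-force search over all subsets (illustrative only)."""
    found = set()
    for size in range(1, len(rects) + 1):
        for subset in combinations(rects, size):
            a = intersect(subset)
            if a is None:
                continue
            # A is an MI if every observation either misses A or contains it.
            if all(intersect((a, r)) in (None, a) for r in rects):
                found.add(a)
    return sorted(found)

# The four observations from the example in the text.
rects = [((0, 2), (1, 3)), ((0, 4), (1, 5)), ((3, 5), (4, 8)), ((3, 5), (4, 8))]
mis = maximal_intersections(rects)
print(mis)
```

On this input the search returns the two rectangles $(0, 2] \times (1, 3]$ and $(3, 4] \times (4, 5]$ named in the text.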

Using an argument similar to Hanley and Parnes (1983), it can be shown that the GMLE of $F_0(x)$ which maximizes the generalized likelihood function, $\Lambda_n$, must assign all the probability masses $s_1, \dots, s_m$ to the sets $A_1, \dots, A_m$. Thus the generalized likelihood function is as follows:

$$\Lambda_n = \prod_{i=1}^{n} \mu_F(\mathcal{I}_i) = \prod_{i=1}^{n} \left[ \sum_{j=1}^{m} 1(A_j \subset \mathcal{I}_i)\, s_j \right], \tag{2.1}$$

where $\mu_F$ is the measure induced by a distribution function $F$, $1(\cdot)$ is the indicator function, $s\ (= (s_1, \dots, s_{m-1})') \in D_s$, $s_m = 1 - s_1 - \cdots - s_{m-1}$, $s'$ is the transpose of the vector $s$, and $D_s = \{s : s_i \ge 0,\ s_1 + \cdots + s_{m-1} \le 1\}$. Denote the GMLE of $s$ by $\hat{s}$ and that of $F_0$ by $\hat{F}_n$.

The $\hat{s}_j$'s can be obtained by the self-consistent algorithm described by Turnbull (1976) for univariate interval-censored data as follows: Let $s_j^{(0)} = 1/m$ for $j = 1, \dots, m$. Denote $\delta_{ij} = 1(A_j \subset \mathcal{I}_i)$. At the $h$-th step,

$$s_j^{(h)} = \sum_{i=1}^{n} \frac{1}{n} \cdot \frac{\delta_{ij}\, s_j^{(h-1)}}{\sum_{k=1}^{m} \delta_{ik}\, s_k^{(h-1)}}, \qquad j = 1, \dots, m,\ h \ge 1.$$

Repeat until the $s_j^{(h)}$'s converge. The justification of the convergence of this method for multivariate interval-censored data is similar to that given in Turnbull (1976) for univariate data.
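
A minimal sketch of the self-consistent iteration may help fix ideas. It assumes the containment indicators (1 if the $j$-th MI lies inside the $i$-th observation, 0 otherwise) have already been computed and are passed as a 0/1 matrix; the function name, tolerance and iteration cap are illustrative choices, not part of the original description. The toy matrix corresponds to four observations and two MIs, for which the likelihood is proportional to $s_1 s_2^2$ and is maximized at $(1/3, 2/3)$.

```python
def self_consistent(delta, tol=1e-12, max_iter=10_000):
    """delta[i][j] = 1 if MI A_j lies inside observation i, else 0."""
    n, m = len(delta), len(delta[0])
    s = [1.0 / m] * m                          # uniform starting masses 1/m
    for _ in range(max_iter):
        new = [0.0] * m
        for row in delta:
            # Mass currently assigned to this observation's rectangle.
            mass = sum(d * sj for d, sj in zip(row, s))
            for j in range(m):
                new[j] += row[j] * s[j] / (n * mass)
        if max(abs(a - b) for a, b in zip(new, s)) < tol:
            return new
        s = new
    return s

# Hypothetical toy input: observation 1 contains A_1 only, observation 2
# contains both MIs, observations 3 and 4 contain A_2 only.
delta = [[1, 0], [1, 1], [0, 1], [0, 1]]
s_hat = self_consistent(delta)
print(s_hat)
```

Each pass redistributes the weight $1/n$ of every observation over the MIs it contains, in proportion to their current masses, so the masses always sum to one.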
Given a GMLE $\hat{s}$, the GMLE of $F_0(x)$ is not uniquely defined on an MI unless the MI is a singleton. A GMLE of $F_0(x)$ can be obtained as follows:

$$\hat{F}_n(x) = \sum_{A_j \subset [0, x_1] \times \cdots \times [0, x_d]} \hat{s}_j. \tag{2.2}$$

Remark 1. The GMLE of $s$ may not be unique, as the following example demonstrates.

Suppose that a sample of size 4 consists of the two-dimensional interval-censored observations $(L_1, R_1, L_2, R_2) = (1, 6, 1, 3)$, $(1, 6, 4, 6)$, $(1, 3, 1, 6)$ and $(4, 6, 1, 6)$. Then the MIs are $A_1 = (1, 3] \times (1, 3]$, $A_2 = (1, 3] \times (4, 6]$, $A_3 = (4, 6] \times (1, 3]$ and $A_4 = (4, 6] \times (4, 6]$. The vector $(\hat{s}_1, \hat{s}_2, \hat{s}_3, \hat{s}_4) = r(1/2, 0, 0, 1/2) + (1 - r)(0, 1/2, 1/2, 0)$ is a GMLE of $s$ for all $r \in [0, 1]$. Thus there are infinitely many expressions for the GMLE. However, each observation contains exactly two MIs whose masses sum to $r/2 + (1 - r)/2$, so $\mu_{\hat{F}_n}(\mathcal{I}_i) = 1/2$, $i = 1, \dots, 4$, for all $r \in [0, 1]$.
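
The non-uniqueness in this example can be confirmed numerically. The sketch below hard-codes which of the four MIs lie inside each of the four observations (read off from the rectangles above) and evaluates the generalized likelihood along the one-parameter family of mass vectors; the helper names are hypothetical.

```python
def likelihood(s, delta):
    """Generalized likelihood: product over observations of the mass they contain."""
    value = 1.0
    for row in delta:
        value *= sum(d * sj for d, sj in zip(row, s))
    return value

# Rows: the four observations; columns: A_1, ..., A_4 from Remark 1.
delta = [
    [1, 0, 1, 0],   # (1,6] x (1,3] contains A_1 and A_3
    [0, 1, 0, 1],   # (1,6] x (4,6] contains A_2 and A_4
    [1, 1, 0, 0],   # (1,3] x (1,6] contains A_1 and A_2
    [0, 0, 1, 1],   # (4,6] x (1,6] contains A_3 and A_4
]

def s_of(r):
    """Mass vector r*(1/2, 0, 0, 1/2) + (1 - r)*(0, 1/2, 1/2, 0)."""
    return [r / 2, (1 - r) / 2, (1 - r) / 2, r / 2]

values = [likelihood(s_of(r), delta) for r in (0.0, 0.25, 0.5, 1.0)]
print(values)
```

Every member of the family yields the same likelihood value, which is why the masses on the MIs are not identified even though the fitted probability of each observed rectangle is.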

In general, $\hat{s}$ may not be consistent under discrete assumptions. However, the consistency of $\hat{F}_n$ on a certain set will not be affected (for more details, see Section 3).

The derivation of the GMLE only requires that the observations $\mathcal{I}_1, \dots, \mathcal{I}_n$ are i.i.d. To derive the asymptotic properties of the GMLE, we need further assumptions on $F_0$ and the distribution function of $(L_1, R_1, \dots, L_d, R_d)$.

A  set  of  univariate  interval-censored  data  are  referred  to  as  case  2  data 
if  they  consist  of  strictly  interval-censored,  right-censored  or  left-censored 
observations,  but  do  not  contain  exact  observations.  For  such  type  of  data, 
Groeneboom  and  Wellner  (1992)  formulate  the  case  2  univariate  interval 
censorship  model.  We  consider  a  natural  multivariate  extension  of  the 
case  2  univariate  interval  censorship  model  in  the  following. 

Suppose $(U_1, V_1, \dots, U_d, V_d)$ is a random censoring vector and is independent of $X$. The observable random vector $(L_1, R_1, \dots, L_d, R_d)$ is generated by the following formula:

$$(L_i, R_i) = \begin{cases} (0, U_i) & \text{if } X_i \le U_i, \\ (U_i, V_i) & \text{if } U_i < X_i \le V_i, \\ (V_i, +\infty) & \text{if } X_i > V_i, \end{cases} \qquad i = 1, \dots, d.$$

We call this model a case 2 multivariate interval censorship model (C2M model). In the next two sections, we shall discuss the asymptotic properties of the GMLE under the C2M model. For ease of presentation and without loss of generality (WLOG), we assume $d = 2$ hereafter.
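
As an illustration of how an observation arises under the C2M model, the following sketch maps a survival vector and a pair of inspection-time vectors to the observed endpoints, coordinate by coordinate. The function names are hypothetical, and the treatment of ties (a failure exactly at an inspection time is assigned to the interval ending there) is an assumption consistent with half-open intervals $(L, R]$.

```python
import math

def censor_component(x, u, v):
    """Return (L, R) for one coordinate with inspection times u < v."""
    if x <= u:
        return (0.0, u)          # left-censored: failure before the first inspection
    if x <= v:
        return (u, v)            # strictly interval-censored
    return (v, math.inf)         # right-censored: failure after the last inspection

def censor_vector(xs, us, vs):
    """Apply the rule coordinate-wise to a d-dimensional survival vector."""
    return [censor_component(x, u, v) for x, u, v in zip(xs, us, vs)]

# Hypothetical bivariate example: X = (0.7, 5.2), U = (1, 2), V = (3, 4).
obs = censor_vector([0.7, 5.2], [1.0, 2.0], [3.0, 4.0])
print(obs)
```

Note that the exact failure times never appear in the output, which is what makes these case 2 data.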
3. CONSISTENCY OF GMLE

In  this  section,  we  make  the  following  assumptions  under  the  C2M 
model: 


The  censoring  vector  (U,  V)  is  discrete.  (3.1 ) 


Let $a = (a_1, a_2)$, $b = (b_1, b_2)$, $U = (U_1, U_2)$ and $V = (V_1, V_2)$. Define

$$\mathcal{B} = \{(a, b) : g(a, b) > 0\}, \quad \text{where } g(a, b) = P(U = a, V = b).$$

Note that each point in $\mathcal{B}$ induces a grid of nine cells in $\mathbb{R}^2$. Let

$$\mathcal{A}_* = \{(x_1, x_2) : x_i \in \{a_i, b_i, \infty\},\ i = 1, 2,\ (a, b) \in \mathcal{B}\}$$

be the set of all such grid points. We shall establish the strong consistency of the GMLE at each point in $\mathcal{A}_*$. From this we can infer the uniform strong consistency of the GMLE if $F_0$ is continuous and $\mathcal{A}_*$ is dense in $[0, \infty)^2$.

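Let $(X_i, U_i, V_i)$, $i = 1, \dots, n$, be i.i.d. copies of $(X, U, V)$. For $(a, b) \in \mathcal{B}$, let

$$I_{11}(a, b) = (-\infty, a_1] \times (-\infty, a_2], \ \dots,$$

$$I_{21}(a, b) = (a_1, b_1] \times (-\infty, a_2], \ \dots,$$

$$I_{31}(a, b) = (b_1, +\infty) \times (-\infty, a_2], \ \dots, \ I_{33}(a, b) = (b_1, +\infty) \times (b_2, +\infty).$$

Let $\mathcal{A}$ be the set of all vertexes of $B_1, \dots, B_h$, where $B_1, \dots, B_h$ are all possible MIs with respect to $I_{ij}(a, b)$, $i, j = 1, 2, 3$, and $(a, b) \in \mathcal{B}$. Note that $\mathcal{A}_*$ is the set of vertexes of the rectangles $I_{ij}(a, b)$. Thus $\mathcal{A}$ and $\mathcal{A}_*$ need not coincide in general. Let

$$N_{ik}(a, b) = \frac{1}{n} \sum_{j=1}^{n} 1(X_j \in I_{ik}(a, b),\ U_j = a,\ V_j = b), \qquad i, k = 1, 2, 3.$$

Then the generalized likelihood (2.1) is equal to

$$\Lambda_n(F) = \prod_{(a, b) \in \mathcal{B}} \prod_{i=1}^{3} \prod_{j=1}^{3} \left[ \mu_F(I_{ij}(a, b)) \right]^{n N_{ij}(a, b)},$$

where

$$\mu_F((c, d] \times (e, f]) = F(d, f) + F(c, e) - F(c, f) - F(d, e). \tag{3.2}$$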
Moreover, the normalized generalized log-likelihood function is

$$\mathcal{L}_n(F) = \sum_{(a, b) \in \mathcal{B}} \sum_{i=1}^{3} \sum_{j=1}^{3} N_{ij}(a, b) \ln[\mu_F(I_{ij}(a, b))].$$

Here and below we interpret $0 \log 0 = 0$ and $\log 0 = -\infty$. For this likelihood function, we let $F$ range over the set $\mathcal{F}^*$ of all functions $F$ on $[-\infty, +\infty]^2$ such that

$$F(+\infty, +\infty) = 1, \tag{3.3}$$

$$F(-\infty, x) = F(x, -\infty) = 0 \quad \text{for each } x, \tag{3.4}$$

and

$$\mu_F(I) \ge 0 \quad \text{for all rectangle sets } I \text{ in } (-\infty, +\infty]^2. \tag{3.5}$$

In view of (3.2), $\Lambda_n(F)$ and $\mathcal{L}_n(F)$ depend on $F$ only through the values of $F$ at the points $x \in \mathcal{A}_*$. Because the GMLE of $F_0$ is not unique, we adopt expression (2.2) for the GMLE in our proofs below.

Theorem 1. Under Assumption (3.1), the GMLE $\hat{F}_n$ satisfies $\hat{F}_n(a) \to F_0(a)$ almost surely for all $a \in \mathcal{A}_*$.

Proof. Verify that

$$L(F) := E(\mathcal{L}_n(F)) = \sum_{(a, b) \in \mathcal{B}} g(a, b)\, h_{a,b}(F) \tag{3.6}$$

with

$$h_{a,b}(F) = \sum_{i=1}^{3} \sum_{j=1}^{3} \mu_{F_0}(I_{ij}(a, b)) \ln[\mu_F(I_{ij}(a, b))].$$

Verify that the expression $h_{a,b}(F)$ is maximized by a function $F \in \mathcal{F}^*$ if and only if

$$\mu_F(I_{ij}(a, b)) = \mu_{F_0}(I_{ij}(a, b)), \qquad i, j = 1, 2, 3. \tag{3.7}$$

Equations (3.2) and (3.4) imply that (3.7) is equivalent to $F(x) = F_0(x)$ for each vertex $x$ of the rectangles $I_{ij}(a, b)$, $i, j = 1, 2, 3$. Thus $F_0$ maximizes $L(F)$ and any other function in $\mathcal{F}^*$ that maximizes $L(F)$ will coincide with $F_0$ on $\mathcal{A}_*$.

Note that $\mathcal{L}_n(F_0) = (1/n) \sum_{j=1}^{n} \psi(X_j, U_j, V_j)$, where $\psi$ is the map defined by

$$\psi(x, a, b) = \sum_{i=1}^{3} \sum_{j=1}^{3} 1(x \in I_{ij}(a, b)) \ln(\mu_{F_0}(I_{ij}(a, b))).$$
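Thus it follows from the SLLN and (3.2) that $\mathcal{L}_n(F_0) \to L(F_0)$ almost surely. By the definition of the GMLE, $\mathcal{L}_n(\hat{F}_n) \ge \mathcal{L}_n(F_0)$. Consequently,

$$\liminf_{n \to \infty} \mathcal{L}_n(\hat{F}_n) \ge \lim_{n \to \infty} \mathcal{L}_n(F_0) = L(F_0) \quad \text{almost surely.}$$

Let $\Omega'$ denote the event on which $\liminf_{n \to \infty} \mathcal{L}_n(\hat{F}_n) \ge L(F_0)$. Fix an $\omega \in \Omega'$, and let $F^* \in \mathcal{F}^*$ be a limit point of $\hat{F}_n(\cdot, \omega)$ in the sense that $\hat{F}_{k_n}(a, \omega) \to F^*(a)$ for all $a \in \mathcal{A}_*$ and for some sequence $\{k_n\}$ of positive integers tending to infinity. We now show that

$$L(F^*) \ge L(F_0).$$

Let $t_{k_n}(a, b)$ denote the value of the random variable $\sum_{i=1}^{3} \sum_{j=1}^{3} N_{ij}^{(k_n)}(a, b) \ln[\mu_{\hat{F}_{k_n}}(I_{ij}(a, b))]$ at the point $\omega$, where $N_{ij}^{(k_n)}$ denotes $N_{ij}$ computed from the first $k_n$ observations. By the definition of $\Omega'$,

$$\liminf_{n \to \infty} \sum_{(a, b) \in \mathcal{B}} t_{k_n}(a, b) \ge L(F_0).$$

Next, verify that

$$t_{k_n}(a, b) \to g(a, b)\, h_{a,b}(F^*)$$

for each $(a, b) \in \mathcal{B}$. Note also that $t_{k_n}(a, b) \le 0$ for all $(a, b) \in \mathcal{B}$. From Fatou's Lemma,

$$\limsup_{n \to \infty} \sum_{(a, b) \in \mathcal{B}} t_{k_n}(a, b) = -\liminf_{n \to \infty} \sum_{(a, b) \in \mathcal{B}} (-t_{k_n}(a, b)) \le -\sum_{(a, b) \in \mathcal{B}} \liminf_{n \to \infty} (-t_{k_n}(a, b)) = \sum_{(a, b) \in \mathcal{B}} g(a, b)\, h_{a,b}(F^*) = L(F^*).$$

Combining the above yields $L(F_0) \le L(F^*)$. As $F_0$ maximizes $L$, we conclude that $L(F^*) = L(F_0)$ and therefore $F^*(a) = F_0(a)$ for all $a \in \mathcal{A}_*$. Since $\omega$ is arbitrary and $\Omega'$ has probability one, the consistency result is thus established. ∎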

If $\mathcal{A}_*$ is a finite set, then it follows from the theorem that the GMLE is uniformly strongly consistent on $\mathcal{A}_*$. For arbitrary $\mathcal{A}_*$, the uniform strong consistency of the GMLE requires additional assumptions.

Theorem 2. Suppose that (3.1) holds, $F_0$ is continuous and $\mathcal{A}_*$ is dense in $[0, +\infty)^2$. Then $\sup_{x \in \mathbb{R}^2} |\hat{F}_n(x) - F_0(x)| \to 0$ almost surely.
Proof. Let $F_1, F_2, \dots$ be functions in $\mathcal{F}^*$ such that $F_n(a) \to F_0(a)$ for all $a \in \mathcal{A}_*$. Let $M$ be a positive integer. Since $F_0$ is continuous, there is a grid which partitions the space $(-\infty, +\infty]^2$ into $M$ disjoint rectangles $I = (c, d] \times (e, f]$ with grid points (upper-right vertexes of the $I$'s) $x_1, \dots, x_M$ in $(-\infty, +\infty]^2$ and $\mu_{F_0}(I) \le 1/M$ for each grid cell $I$. The continuity of $F_0$ and the fact that $\mathcal{A}_*$ is dense in $[0, +\infty)^2$ imply that there are points $a_1, \dots, a_M$ in $\mathcal{A}_*$ such that $|F_0(a_i) - F_0(x_i)| < 1/M^2$. Using this and the facts $F_0, F_n \in \mathcal{F}^*$ and that $F_0(c, e) \le F_0(x) \le F_0(d, f)$ and $F_n(c, e) \le F_n(x) \le F_n(d, f)$ for each $x \in I$, we derive that

|F„(x)-Fo(x)|<  max  |F„(a,)  -F0(a,)|  +  -^,  xe^2. 

This shows that $F_n$ converges to $F_0$ uniformly.

By the above, the events $\{\hat{F}_n(a) \to F_0(a) \text{ for all } a \in \mathcal{A}_*\}$ and $\{\sup_{x \in \mathbb{R}^2} |\hat{F}_n(x) - F_0(x)| \to 0\}$ are identical and thus have probability 1 by Theorem 1. ∎

Remark  2.  In  the  case  of  the  bivariate  right  censorship  model,  under 
the  assumptions  in  Theorem  2,  it  is  well  known  that  the  GMLE  is  not  a 
consistent  estimate  of  a  continuous  F0  (see  Tsai  et  al.  (1986)). 


4.  ASYMPTOTIC  NORMALITY  OF  GMLE 

Under the univariate case 2 interval censorship model, Groeneboom and Wellner (1992) conjecture that if the censoring distribution is continuous, then the GMLE of a continuous $F_0$ is not asymptotically normally distributed and the convergence rate is not $\sqrt{n}$. Yu et al. (1998) prove that if the censoring vector takes on finitely many values, then under an additional assumption the GMLE is asymptotically normally distributed and the convergence rate is $\sqrt{n}$. In the multivariate case, the situation is more complicated. In this section we shall obtain the asymptotic normality of the GMLE under the C2M model and the assumptions that

$\mathcal{A}_*$ contains finitely many elements, (4.1)

Ff0((°i>  ^i]  x (a2>  ^2]) >0  ^  and  1  =  1,2.  (4.2) 

and

$\mathcal{A} = \mathcal{A}_*$ (see Section 3). (4.3)


Note  that  under  the  current  assumptions  the  standard  method  for  finite 
parametric  models  can  be  used. 

Remark 3. The GMLE of $s$ may not be unique (see Remark 1) and Theorem 1 does not ensure the consistency of the GMLE $\hat{s}$, as $\mathcal{A}$ and $\mathcal{A}_*$ are not the same in general. Note that the consistency of the GMLE $\hat{F}_n$ on $\mathcal{A}_*$ is mainly due to Eq. (3.7), since $\mathcal{A}_*$ is the set of all vertexes of the rectangles $I_{ij}(a, b)$.

By Theorem 1 and (4.3), the GMLE $\hat{F}_n$ is consistent on the set $\mathcal{A}$. Since $\hat{s}_j = \mu_{\hat{F}_n}(A_j)$, where the vertexes of the MI $A_j$ belong to $\mathcal{A}$, $\hat{s}$ is consistent by (3.2).

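Let $s_j^0 = \mu_{F_0}(A_j)$. Then (4.2) yields $s_j^0 > 0$ for all $j$. Verify that (3.6) yields

$$L(F) = \sum_{(a, b) \in \mathcal{B}} g(a, b) \sum_{i=1}^{3} \sum_{j=1}^{3} \Big[ \sum_{k} s_k^0\, 1(A_k \subset I_{ij}(a, b)) \Big] \ln \Big[ \sum_{l} s_l\, 1(A_l \subset I_{ij}(a, b)) \Big]$$

$$= \sum_{(a, b) \in \mathcal{B}} \sum_{i=1}^{3} \sum_{j=1}^{3} \Big[ g(a, b) \sum_{k} s_k^0\, 1(A_k \subset I_{ij}(a, b)) \Big] \ln \Big[ \sum_{l} s_l\, 1(A_l \subset I_{ij}(a, b)) \Big]. \tag{4.4}$$

Let

$$\{I_1, \dots, I_p\} = \{I_{ij}(a, b) : i, j = 1, 2, 3,\ (a, b) \in \mathcal{B}\},$$

and, for $I_h = I_{ij}(a, b)$,

$$p_h = g(a, b) \sum_{k} s_k^0\, 1(A_k \subset I_{ij}(a, b)).$$

We can rewrite (4.4) as

$$L(F) = \sum_{h=1}^{p} p_h \ln \sum_{j=1}^{m} s_j\, 1(A_j \subset I_h) = \sum_{h=1}^{p} p_h \ln \sum_{j=1}^{m} s_j \delta_{hj}.$$

From (4.2), $p_h > 0$, $h = 1, \dots, p$. Set $J = -E(\partial^2 \mathcal{L}_n(F_0) / \partial s\, \partial s')$, where $\partial \mathcal{L}_n / \partial s$ is an $(m - 1) \times 1$ vector and $\partial^2 \mathcal{L}_n / \partial s\, \partial s'$ is an $(m - 1) \times (m - 1)$ matrix. Verify that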
Verify  that 
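$$J = -\frac{\partial^2 L(F_0)}{\partial s\, \partial s'} = \left( \sum_{h=1}^{p} p_h \frac{(\delta_{hi} - \delta_{hm})(\delta_{hk} - \delta_{hm})}{\left( \sum_{j=1}^{m} \delta_{hj} s_j^0 \right)^2} \right)_{(m-1) \times (m-1)} = UU',$$

where

$$U = \begin{pmatrix} \dfrac{(\delta_{11} - \delta_{1m}) \sqrt{p_1}}{\sum_{j} \delta_{1j} s_j^0} & \cdots & \dfrac{(\delta_{p1} - \delta_{pm}) \sqrt{p_p}}{\sum_{j} \delta_{pj} s_j^0} \\ \vdots & & \vdots \\ \dfrac{(\delta_{1(m-1)} - \delta_{1m}) \sqrt{p_1}}{\sum_{j} \delta_{1j} s_j^0} & \cdots & \dfrac{(\delta_{p(m-1)} - \delta_{pm}) \sqrt{p_p}}{\sum_{j} \delta_{pj} s_j^0} \end{pmatrix}.$$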

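We now show that $J$ is nonsingular. Let $x_j$ be the upper-right vertex of $A_j$, $j = 1, \dots, m - 1$. By reordering the $I_h$'s, WLOG, we can assume that the upper-right vertex of $I_i$ is equal to $x_i$, $i = 1, \dots, m - 1$. Thus $I_i \cap A_j = \emptyset$ for $j > i$, $i = 1, \dots, m - 1$. Then the matrix $U$ has the upper triangular form

$$U = \begin{pmatrix} \dfrac{\sqrt{p_1}}{s_1^0} & * & \cdots & * & \cdots & \dfrac{(\delta_{p1} - \delta_{pm}) \sqrt{p_p}}{\sum_{j} \delta_{pj} s_j^0} \\ 0 & \dfrac{\sqrt{p_2}}{s_2^0 + \delta_{21} s_1^0} & \cdots & * & \cdots & \dfrac{(\delta_{p2} - \delta_{pm}) \sqrt{p_p}}{\sum_{j} \delta_{pj} s_j^0} \\ \vdots & & \ddots & \vdots & & \vdots \\ 0 & \cdots & 0 & \dfrac{\sqrt{p_{m-1}}}{\sum_{j} \delta_{(m-1)j} s_j^0} & \cdots & \dfrac{(\delta_{p(m-1)} - \delta_{pm}) \sqrt{p_p}}{\sum_{j} \delta_{pj} s_j^0} \end{pmatrix},$$

where $*$ marks entries whose values are not needed for the argument.

Recall $s_i^0 > 0$ and $p_i > 0$ for $i = 1, \dots, m - 1$. It follows that the matrix $U$ is of full rank and $J = UU'$ is nonsingular.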
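
To make the factorization concrete, the sketch below builds $U$ and $J = UU'$ from a containment matrix, the weights $p_h$ and the true masses $s_j^0$. The entry formula used is the one obtained by differentiating $L(F) = \sum_h p_h \ln \sum_j s_j \delta_{hj}$ with respect to $s_1, \dots, s_{m-1}$ after substituting $s_m = 1 - s_1 - \cdots - s_{m-1}$. The function name and the toy inputs are hypothetical and are not derived from an actual C2M design.

```python
import math

def information_matrix(delta, p, s0):
    """Return (U, J) with U[i][h] = (delta[h][i] - delta[h][m-1]) * sqrt(p[h]) / D_h,
    where D_h = sum_j delta[h][j] * s0[j], and J = U U'."""
    m = len(s0)
    U = []
    for i in range(m - 1):
        row = []
        for h, d in enumerate(delta):
            denom = sum(dj * sj for dj, sj in zip(d, s0))
            row.append((d[i] - d[m - 1]) * math.sqrt(p[h]) / denom)
        U.append(row)
    # J = U U' is (m-1) x (m-1).
    J = [[sum(a * b for a, b in zip(U[i], U[k])) for k in range(m - 1)]
         for i in range(m - 1)]
    return U, J

# Toy inputs (assumed for illustration): three sets I_h and two MIs.
delta = [[1, 0], [1, 1], [0, 1]]
s0 = [1 / 3, 2 / 3]
p = [0.25, 0.25, 0.5]
U, J = information_matrix(delta, p, s0)
print(U, J)
```

With two MIs the matrix $J$ is a single positive number, so nonsingularity is immediate; in larger designs one would check the rank of $U$ numerically.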

It  is  easy  to  verify  that 


gW„)  /32if(F0)\  j 

3s  3s'  V  SsSs>  J 


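It thus follows that

$$\frac{\partial \mathcal{L}_n(\hat{F}_n)}{\partial s} - \frac{\partial \mathcal{L}_n(F_0)}{\partial s} = -J \Delta_n + o_p(\|\Delta_n\|),$$

where $\Delta_n$ is the $(m - 1)$-dimensional column vector with entries $\hat{s}_i - s_i^0 = \hat{s}_i - \mu_{F_0}(A_i)$, $i = 1, \dots, m - 1$. Let $\Omega_n = \{\inf_{i \le m} \hat{s}_i = 0\}$. Verify that

$$\frac{\partial \mathcal{L}_n(\hat{F}_n)}{\partial s} = 0 \quad \text{except on the event } \Omega_n,$$

and by Theorem 1 and Assumptions (4.1) and (4.2),

$$P(\Omega_n) \to 0 \quad \text{as } n \to \infty.$$

It follows from the CLT that $\sqrt{n}\, (\partial \mathcal{L}_n(F_0) / \partial s)$ is asymptotically normal with mean 0 and dispersion matrix $J$. This shows that $\Delta_n = J^{-1} (\partial \mathcal{L}_n(F_0) / \partial s) + o_p(n^{-1/2})$. Thus we have the following result.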


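Theorem 3. Under Assumptions (4.1), (4.2) and (4.3), $\sqrt{n}\, \Delta_n = \sqrt{n}\, (\hat{s}_i - s_i^0)_{i = 1, \dots, m-1}$ is asymptotically normal with mean 0 and dispersion matrix $J^{-1}$. A strongly consistent estimator of $J$ is given by $\hat{J} = -(\partial^2 \mathcal{L}_n(\hat{F}_n) / \partial s\, \partial s')$. Furthermore, $\sqrt{n}\, [\hat{F}_n(x) - F_0(x)]$ is asymptotically normally distributed for all $x \in \mathcal{A}_*$. A consistent estimate of the asymptotic variance of $\hat{F}_n(x)$ is $(1/n)\, c' \hat{J}^{-1} c$, where $c$ is an $(m - 1) \times 1$ vector with the $i$th entry $c_i = 1(A_i \subset [0, x_1] \times [0, x_2])$, unless $F_0(x) = 1$.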


Under  the  assumptions  in  Theorem  3,  the  GMLE  is  also  asymptotically 
efficient.  The  proof  of  this  assertion  is  straightforward  and  is  omitted. 
Abstract. In this paper we consider an interval censorship model in which the endpoints of the censoring intervals are determined by a two stage experiment. In the first stage the value $k$ of a random integer is selected; in the second stage the endpoints are determined by a case $k$ interval censorship model. We prove the strong consistency in the $L_1(\mu)$-topology of the nonparametric maximum likelihood estimate of the underlying survival function for a measure $\mu$ which is derived from the distributions of the endpoints. This consistency result yields strong consistency for the topologies of weak convergence, pointwise convergence and uniform convergence under additional assumptions. These results improve and generalize existing ones in the literature.

Short  Title:  Interval  censorship  model. 
1. Introduction


In industrial life testing and medical research, one is frequently unable to observe the random variable $X$ of interest directly, but can observe a pair $(L, R)$ of extended random variables such that

$$-\infty \le L < X \le R \le \infty.$$


For example, consider an animal study in which a mouse has to be dissected to check whether a tumor has developed. At the time of dissection we can only infer whether the tumor is present, or has not yet developed. Thus, if we let $X$ denote the onset of tumor and $Y$ the time of the dissection, then the corresponding pair $(L, R)$ is given by

$$(L, R) = \begin{cases} (-\infty, Y), & X \le Y, \\ (Y, \infty), & X > Y. \end{cases}$$

If $X$ and $Y$ are independent, then this model is called the case 1 interval censorship model (Groeneboom and Wellner (1992)) and the data pair $(L, R)$ is usually replaced by the current status data $(Y, I[X \le Y])$, where $I[A]$ is the indicator function of the set $A$. Examples of current status data are mentioned in Ayer et al. (1955), Keiding (1991) and Wang and Gardiner (1996).

Another interval censorship model is the case 2 model considered by Groeneboom and Wellner (1992). Consider an experiment with two inspection times $U$ and $V$ such that $U < V$ and $(U, V)$ is independent of $X$. One can only determine whether $X$ occurs before time $U$, between times $U$ and $V$, or after time $V$. More formally, one observes the random vector $(U, V, I[X \le U], I[U < X \le V])$. In this model

$$(L, R) = \begin{cases} (-\infty, U), & X \le U, \\ (U, V), & U < X \le V, \\ (V, \infty), & X > V. \end{cases}$$

Note that $(L, R)$ is a function of the random vector $(U, V, I[X \le U], I[U < X \le V])$. However, $V$ cannot be recovered from the pair $(L, R)$ on the event $\{X \le U\}$. Thus the pair $(L, R)$ carries less information than the vector $(U, V, I[X \le U], I[U < X \le V])$.

The case 1 and case 2 models are special cases of the case $k$ model (Wellner, 1995). In this model there are $k$ inspection times $Y_1 < \cdots < Y_k$ which are independent of $X$, and one observes into which of the random intervals $(-\infty, Y_1], \dots, (Y_k, \infty)$ the random variable $X$ belongs. Note that the case $k$ model for $k > 2$ can be formally reduced to a case 2 model with $U$ and $V$ functions of $X$ and the inspection times $Y_1, \dots, Y_k$. The resulting $U$ and $V$ are then no longer independent of $X$, violating a key assumption used in deriving consistency results for the case 2 model.

While  the  case  1  model  gives  a  good  description  of  the  animal  study  mentioned  above,  a  data 
set from a case $k$ model ($k > 2$) is difficult to find in medical research since it is very unlikely that
every  patient  under  study  has  exactly  the  same  number  of  visits.  Finkelstein  and  Wolfe  (1985) 
presented  a  closely  related  type  of  interval-censored  data  in  comparing  two  different  treatments  for 
breast  cancer  patients.  The  censoring  intervals  arose  in  the  follow-up  studies  for  patients  treated 
with  radiotherapy  and  chemotherapy.  The  failure  time  X  is  the  time  until  cosmetic  deterioration 
as  determined  by  the  appearance  of  breast  retraction.  Each  patient  had  several  follow-ups  and 
the  number  of  follow-ups  differed  from  patient  to  patient.  One  only  knows  that  the  failure  time 
occurred  either  before  the  first  follow-up,  or  after  the  last  follow-up  or  between  two  consecutive 
follow-ups.  Other  examples  of  such  type  of  interval-censored  data  can  be  found  in  AIDS  studies 
(Becker  and  Melbye  (1991);  Aragon  and  Eberly  (1992)). 

In  this  paper  we  assume  that  the  pair  (L,  R)  is  generated  as  a  mixture  of  case  k  models.  This 
formulation  encompasses  the  various  case  k  models  and  the  data  setting  occurring  in  Finkelstein 
and  Wolfe  (1985).  A  precise  definition  of  this  mixture  model  is  given  in  Section  2. 

Let $F_0$ denote the unknown distribution function of $X$. This distribution function is commonly estimated by the generalized maximum likelihood estimate (GMLE). Ayer et al. (1955) derived an explicit expression of the GMLE for the case 1 model. However, in general the GMLE does not have an explicit solution. In deriving a numerical solution for the GMLE, Peto (1973) used the Newton-Raphson algorithm; Turnbull (1976) proposed a self-consistent algorithm; Groeneboom and Wellner (1992) proposed an iterative convex minorant algorithm. A detailed discussion of some computational aspects is given in Wellner and Zhan (1997).

Various consistency results are available for the GMLE. In the case 1 model, Ayer et al. (1955) proved the weak consistency of the GMLE at continuity points of $F_0$ under additional assumptions on $G$, the distribution function of $Y$. The uniform strong consistency of the GMLE has been established by Groeneboom and Wellner (1992), van de Geer (1993, Example 3.3a), Wang and Gardiner (1996) and Yu et al. (1998a) for continuous $F_0$ using various assumptions and techniques. In the case 2 model, the uniform strong consistency of the GMLE has been established by Groeneboom and Wellner (1992), van de Geer (1993, Example 3.3b), and Yu et al. (1998b) for continuous $F_0$.

In Section 2 we shall obtain the strong $L_1(\mu)$-consistency of the GMLE for our mixture of case $k$ models for some measure $\mu$. This result shows that the $L_1(\mu)$-topology is the appropriate topology as it gives consistency without additional assumptions in the case $k$ models. Convergence in stronger topologies such as the topologies of weak convergence and uniform convergence requires additional conditions. This is pursued in Section 3. In the process we also point out some erroneous consistency claims in the literature. The proof of the $L_1(\mu)$-consistency is given in Section 4. It exploits the special structure of the likelihood for this model and does not require any advanced theory. Section 5 collects various other proofs.


2.  Main  Results 

We begin by giving a precise definition of our model. This is done by describing how the endpoints $L$ and $R$ are generated. Let $K$ be a positive random integer and $Y = \{Y_{k,j} : k = 1, 2, \dots,\ j = 1, \dots, k\}$ be an array of random variables such that $Y_{k,1} < \cdots < Y_{k,k}$. Assume throughout that $(K, Y)$ and $X$ are independent. On the event $\{K = k\}$, let $(L, R)$ denote the endpoints of that random interval among $(-\infty, Y_{k,1}], (Y_{k,1}, Y_{k,2}], \dots, (Y_{k,k}, \infty)$ which contains $X$. We refer to this model as the mixed case model as it can be viewed as a mixture of the various case $k$ models.
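
The rule that turns a failure time and a set of inspection times into the observed endpoints can be sketched in a few lines. The function name is hypothetical; the inspection times are assumed to be sorted and the failure time finite, and the number of inspection times may differ from subject to subject, which is exactly the mixed case setting.

```python
import bisect
import math

def censoring_interval(x, inspection_times):
    """Return (L, R) with L < x <= R for sorted inspection times y_1 < ... < y_k."""
    grid = [-math.inf] + list(inspection_times) + [math.inf]
    # bisect_left finds the first grid point >= x, which is the right endpoint R.
    idx = bisect.bisect_left(grid, x)
    return (grid[idx - 1], grid[idx])

# Two hypothetical subjects with different numbers of visits (K = 3 and K = 1).
first = censoring_interval(3.0, [1.0, 2.5, 4.0])
second = censoring_interval(0.5, [1.0])
print(first, second)
```

The first subject's failure is bracketed by two consecutive visits, while the second subject's failure occurs before the only visit and is therefore left-censored.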

In  some  clinical  studies,  an  examination  is  performed  at  the  start  of  the  study  and  follow-ups 
are  scheduled  one  at  a  time  till  the  end  of  the  study.  This  can  be  modeled  by  taking  =  Ei= i  & 
and  K  =  sup{fc  >  1  :  Ei=i  &  <  r},  where  £i,£2j---  denote  the  (positive)  inter-follow-up  times 
and $r$ is the length of the study. In this case $K$ may not be bounded. For example, if the inter-follow-up times are independent with a common exponential distribution, then $K - 1$ is a Poisson random variable; thus $K$ is unbounded, yet $E(K) < \infty$. In general, if the inter-follow-up times are independent and identically distributed, then $E(K) < \infty$.

To define the GMLE, let $(L_1, R_1), \dots, (L_n, R_n)$ be independent copies of the pair $(L, R)$ defined above and define the generalized likelihood function $\Lambda_n$ by

$$\Lambda_n(F) = \prod_{j=1}^{n} [F(R_j) - F(L_j)], \qquad F \in \mathcal{F},$$

where $\mathcal{F}$ is the collection of all nondecreasing functions $F$ from $[-\infty, +\infty]$ into $[0, 1]$ with $F(-\infty) = 0$ and $F(+\infty) = 1$. We think of $F_0$ as a member of $\mathcal{F}$. Note that $\Lambda_n(F)$ depends on $F$ only through the values of $F$ at the points $L_j$ or $R_j$, $j = 1, \dots, n$. Thus there exists no unique maximizer of $\Lambda_n(F)$ over the set $\mathcal{F}$. However, there exists a unique maximizer $\hat{F}_n$ over the set $\mathcal{F}$ which is right continuous and piecewise constant with possible discontinuities only at the observed values of $L_j$ and $R_j$, $j = 1, \dots, n$. We call this maximizer $\hat{F}_n$ the GMLE of $F_0$.

Define a measure $\mu$ on the Borel $\sigma$-field $\mathcal{B}$ on $\mathbb{R}$ by

$$\mu(B) = \sum_{k=1}^{\infty} P(K = k) \sum_{j=1}^{k} P(Y_{k,j} \in B \mid K = k), \qquad B \in \mathcal{B}.$$

We are now ready to state our main result, namely the (strong) $L_1(\mu)$-consistency of the GMLE.

2.1. Theorem. Let $E(K) < \infty$. Then $\int |\hat{F}_n - F_0|\, d\mu \to 0$ almost surely.

The condition $E(K) < \infty$ implies the finiteness of the measure $\mu$ and of the expectation $E[\log(F_0(R) - F_0(L))]$. These two latter conditions play an important role in our proof given in Section 4.
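
For discrete inspection schedules the measure just defined reduces to point masses, and its total mass equals the expected number of inspections. The sketch below assumes, purely for illustration, that the joint law of the number of inspections and the inspection times is given as a dictionary mapping each possible schedule to its probability; the names are hypothetical.

```python
from collections import defaultdict

def mu_point_masses(schedules):
    """schedules maps a tuple of inspection times (y_1, ..., y_k) to its probability."""
    mass = defaultdict(float)
    for times, prob in schedules.items():
        for y in times:
            mass[y] += prob      # adds P(K = k, Y_{k,j} = y) for each j
    return dict(mass)

# Hypothetical design: one visit at time 1 with probability 1/2,
# or visits at times 1 and 2 with probability 1/2.
schedules = {(1.0,): 0.5, (1.0, 2.0): 0.5}
mu = mu_point_masses(schedules)
expected_k = sum(len(times) * prob for times, prob in schedules.items())
print(mu, expected_k)
```

Here the total mass of the measure is 1.5, which matches the expected number of inspections, illustrating why a finite expectation of the number of inspections makes the measure finite.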

One referee pointed out that results of van de Geer (1993) (namely her Lemma 1.1 and Theorem 3.1) may be used to prove a result very similar to our Theorem 2.1 with the help of some inequalities suggested by this referee. This alternative proof leads to $L_1(\tilde{\mu})$-consistency for some finite measure $\tilde{\mu}$ that is equivalent to our measure $\mu$ and does not require the finiteness of $E(K)$. Actually, such a result implies our result in view of the following simple lemma, which we state without a proof.

2.2. Lemma. Let $\mu_1$ and $\mu_2$ be two finite measures and $g, g_1, g_2, \dots$ be measurable functions into $[0, 1]$. Suppose that $\mu_2$ is absolutely continuous with respect to $\mu_1$. Then $\int |g_n - g|\, d\mu_1 \to 0$ implies $\int |g_n - g|\, d\mu_2 \to 0$.

We have decided to present our original proof since it is direct and elementary and since $E(K) < \infty$ is a rather mild assumption that is typically satisfied in applications.

In the remainder of this section we mention some corollaries of Theorem 2.1. The first one is of interest when the inspection times are discrete. It follows from the fact that $\mu(\{a\})\, |\hat{F}_n(a) - F_0(a)| \le \int |\hat{F}_n - F_0|\, d\mu$ for every $a \in \mathbb{R}$ and generalizes the consistency results given in Yu et al. (1998a, b) for the case 1 and case 2 models with discrete inspection times.

2.3. Corollary. Let $E(K) < \infty$. Then $\hat{F}_n(a) \to F_0(a)$ almost surely for each point $a$ with $\mu(\{a\}) > 0$.

In the next corollary we state results for a measure $\nu$ that depends on the distribution of $L$ and $R$ and is easier to interpret than $\mu$. We take $\nu$ to be the sum of the marginal distributions of $L$ and $R$:

$$\nu(B) = P(L \in B) + P(R \in B), \qquad B \in \mathcal{B}.$$

In view of the set inclusion

$$\{L \in B\} \cup \{R \in B\} \subset \bigcup_{k=1}^{\infty} \bigcup_{j=1}^{k} \{K = k,\ Y_{k,j} \in B\},$$

we have $\nu(B) \le 2\mu(B)$. Thus we immediately get the following corollary.

2.4. Corollary. Let $E(K) < \infty$. Then the following are true.

(1) $\int |\hat{F}_n - F_0|\, d\nu \to 0$ almost surely.

(2) $\hat{F}_n(a) \to F_0(a)$ almost surely for each point $a$ with $\nu(\{a\}) > 0$.


3.  Other  Consistency  Results 

In this section we shall show that under additional assumptions strong $L_1(\mu)$-consistency implies strong consistency in other topologies such as the topologies of weak convergence, pointwise convergence and uniform convergence. Throughout we always assume that $E(K)$ is finite so that $\mu$ is a finite measure and $P(\Omega_\mu) = 1$ by Theorem 2.1, where

$$\Omega_\mu = \Big\{ \lim_{n \to \infty} \int |\hat{F}_n - F_0|\, d\mu = 0 \Big\}.$$

Although the results of this section are formulated for the measure $\mu$ defined in the previous section, they are true for any finite measure for which the GMLE is strongly $L_1$-consistent, as only the finiteness of $\mu$ and $P(\Omega_\mu) = 1$ are used in their proofs. These proofs are deferred to Section 5.

Let $a$ be a real number. We call $a$ a support point of $\mu$ if $\mu((a - \epsilon, a + \epsilon)) > 0$ for every $\epsilon > 0$. We call $a$ regular if $\mu((a - \epsilon, a]) > 0$ and $\mu([a, a + \epsilon)) > 0$ for all $\epsilon > 0$. We call $a$ strongly regular if $\mu((a - \epsilon, a)) > 0$ and $\mu([a, a + \epsilon)) > 0$ for all $\epsilon > 0$. We call $a$ a point of increase of $F_0$ if $F_0(a + \epsilon) - F_0(a - \epsilon) > 0$ for each $\epsilon > 0$.

In view of the inequality $\nu \le 2\mu$, sufficient conditions for the first three of the above concepts are obtained by replacing $\mu$ by $\nu$. As these sufficient conditions are in terms of the distribution of $L$ and $R$, they are easier to interpret and thus more meaningful from an applied point of view.

Ayer et al. (1955) established the weak consistency of the GMLE at regular continuity points of $F_0$ in the case 1 model. Our first proposition gives a strong consistency result for regular continuity points in our more general model.

3.1. Proposition. For each $\omega \in \Omega_\mu$ and each regular continuity point $a$ of $F_0$, $\hat{F}_n(a; \omega) \to F_0(a)$.

The next two propositions address weak convergence on an open interval and on the entire line.

3.2. Proposition. Suppose every point in an open interval $(a, b)$ is a support point of $\mu$. Then $\hat{F}_n(x; \omega) \to F_0(x)$ for every continuity point $x$ of $F_0$ in $(a, b)$ and every $\omega \in \Omega_\mu$. If also $F_0(a) = 0$ and $F_0(b-) = 1$, then $\hat{F}_n(x; \omega) \to F_0(x)$ for all continuity points $x$ of $F_0$ and all $\omega \in \Omega_\mu$.

3.3. Proposition. If every point of increase of $F_0$ is strongly regular, then $\hat{F}_n(x; \omega) \to F_0(x)$ for all continuity points $x$ of $F_0$ and all $\omega \in \Omega_\mu$.

Combining these propositions with Corollary 2.3 yields the following results on pointwise convergence on open intervals and on the entire line.

3.4. Corollary. Suppose every point $x$ in an open interval $(a, b)$ is a support point of $\mu$ and satisfies $\mu(\{x\}) > 0$ if $x$ is a discontinuity point of $F_0$. Then $\hat{F}_n(x; \omega) \to F_0(x)$ for every $x$ in $(a, b)$ and every $\omega \in \Omega_\mu$. Moreover, if $F_0(a) = 0$ and $F_0(b-) = 1$, then $\hat{F}_n(x; \omega) \to F_0(x)$ for all $x \in \mathbb{R}$ and all $\omega \in \Omega_\mu$.

3.5. Corollary. If every point of increase of $F_0$ is strongly regular and if $\mu(\{a\}) > 0$ for each discontinuity point $a$ of $F_0$, then $\hat{F}_n(x; \omega) \to F_0(x)$ for all $x \in \mathbb{R}$ and all $\omega \in \Omega_\mu$.

The  next  proposition  addresses  uniform  convergence. 

3.6. Proposition. Suppose that $F_0$ is continuous and that, for all $a < b$, $0 < F_0(a) < F_0(b) < 1$
implies $\mu((a,b)) > 0$. Then the GMLE is uniformly strongly consistent, i.e.,

$$\sup_{x \in \mathbb{R}} |\hat F_n(x) - F_0(x)| \to 0 \quad \text{a.s.}$$


This proposition generalizes the strong uniform consistency results given by Groeneboom and
Wellner (1992) for the case 1 and 2 models. In the case 1 model they require that $F_0$ and $G$,
the distribution function of $Y$, are continuous and that the probability measure $\mu_{F_0}$ induced by
$F_0$ is absolutely continuous with respect to $\mu$ ($\mu_{F_0} \ll \mu$). Proposition 3.6 does not require the
continuity of $G$ and weakens the absolute continuity requirement. In the case 2 model Groeneboom
and Wellner assume that $F_0$ is continuous and that the joint distribution of $U$ and $V$ has a Lebesgue
density $g$ such that $g(u,v) > 0$ if $0 < F_0(u) < F_0(v) < 1$. Their assumption implies that the measure
$\mu$ has a Lebesgue density which is positive on the set $\{t : 0 < F_0(t) < 1\}$ and therefore implies that
$\mu((a,b)) > 0$ if $0 < F_0(a) < F_0(b) < 1$. Consequently, Proposition 3.6 improves and generalizes
their result.

Proposition 3.6 also generalizes the strong uniform consistency results given by van de Geer
(1993) for the case 1 and 2 models under the assumption that $F_0$ is continuous and $\mu_{F_0} \ll \mu$. The
latter implies that $\mu((a,b)) > 0$ if $0 < F_0(a) < F_0(b) < 1$. However, if $\mu$ is discrete, its support is
dense in $(0,+\infty)$, and $F_0$ is exponential, then the assumption in Proposition 3.6 is satisfied, but
$\mu_{F_0} \ll \mu$ is not true.

In clinical follow-ups, the studies typically last for a certain period of time, say $[\tau_1,\tau_2]$. It is
often the case that $F_0(\tau_2) < 1$, in which case the conditions in Proposition 3.6 are not satisfied. In this
regard, Gentleman and Geyer (1994) claimed a vague convergence result in their Theorem 2 and
Huang (1996) claimed a uniform strong consistency result in his Theorem 3.1. Both of their results
as stated imply the uniform strong consistency of the GMLE on $[\tau_1,\tau_2]$ in the case 1 model, if $F_0$
is continuous and the inspection time $Y$ is uniformly distributed on $[\tau_1,\tau_2]$. The following example
shows that this is not true.

3.7. Example. Consider current status data $(Y_1, I[X_1 \le Y_1]), \dots, (Y_n, I[X_n \le Y_n])$, where the
survival times $X_1, \dots, X_n$ are uniformly distributed on $[0,3]$ and the inspection times $Y_1, \dots, Y_n$ are
uniformly distributed on $[1,2]$. Then $F_0$ is the uniform distribution function on $[0,3]$ and $\mu$ is the
uniform distribution on $[1,2]$. Note that on the event

$$\bigcup_{j=1}^{n} \{X_j > 2 > Y_j,\ Y_j < Y_i,\ i = 1,\dots,n,\ i \ne j\}$$

we have $\hat F_n(1) = 0$, and on the event

$$\bigcup_{j=1}^{n} \{X_j < 1 < Y_j,\ Y_j > Y_i,\ i = 1,\dots,n,\ i \ne j\}$$

we have $\hat F_n(2) = \hat F_n(2-) = 1$. Both events have probability 1/3. Since $F_0(1) = 1/3$ and $F_0(2) = F_0(2-) = 2/3$,
we see that $\hat F_n(x)$ does not converge to $F_0(x)$ almost surely for $x = 1, 2$ and $\hat F_n(2-)$ does not
converge to $F_0(2-)$ almost surely. This shows that pointwise convergence on the closed interval
$[\tau_1,\tau_2]$ to a continuous $F_0$ is not implied by the condition: $\mu((a,b)) > 0$ for all $a$ and $b$ such that
$\tau_1 \le a < b \le \tau_2$.
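
The two events in Example 3.7 can be checked by simulation. The following Python sketch is an illustration only and is not part of the original argument. It assumes the standard fact that, for current status data, the GMLE at the ordered inspection times is the isotonic regression of the indicators, and it computes that fit with a pool-adjacent-violators routine. The function names, sample size, number of replications, and seed are hypothetical choices.

```python
import random

def pava(values):
    """Nondecreasing least-squares fit by pool-adjacent-violators."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the last block mean.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def simulate(n=100, reps=2000, seed=0):
    """Frequencies of the events in Example 3.7 (hypothetical helper)."""
    rng = random.Random(seed)
    paper_low = paper_high = zero_at_min = one_at_max = 0
    implied = True
    for _ in range(reps):
        # Pairs (Y_i, X_i), sorted by inspection time Y_i.
        data = sorted((rng.uniform(1, 2), rng.uniform(0, 3)) for _ in range(n))
        fit = pava([1 if x <= y else 0 for y, x in data])
        low_event = data[0][1] > 2     # X_j > 2 at the smallest Y_j
        high_event = data[-1][1] < 1   # X_j < 1 at the largest Y_j
        paper_low += low_event
        paper_high += high_event
        zero_at_min += fit[0] == 0.0
        one_at_max += fit[-1] == 1.0
        # Each event should force the fitted value at the corresponding end.
        implied = implied and (not low_event or fit[0] == 0.0)
        implied = implied and (not high_event or fit[-1] == 1.0)
    return {
        "paper_low": paper_low / reps,
        "paper_high": paper_high / reps,
        "zero_at_min": zero_at_min / reps,
        "one_at_max": one_at_max / reps,
        "implied": implied,
    }

result = simulate()
```

Under these assumptions, the two event frequencies should stay near 1/3 whatever the value of `n`, while the fitted value at the smallest inspection time is exactly 0 at least that often. This is the behaviour that rules out almost sure convergence at the endpoints.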

The  following  proposition  indicates  how  to  fix  the  assumptions. 

3.8. Proposition. Suppose the following four conditions hold for real numbers $\tau_1 < \tau_2$.

(1) $F_0$ is continuous at every point in the interval $(\tau_1,\tau_2]$;

(2) either $\mu(\{\tau_1\}) > 0$ or $F_0(\tau_1) = 0$;

(3) either $\mu(\{\tau_2\}) > 0$ or $F_0(\tau_2-) = 1$;

(4) for all $a$ and $b$ in $(\tau_1,\tau_2)$, $0 < F_0(a) < F_0(b) < 1$ implies $\mu((a,b)) > 0$.

Then the GMLE is uniformly strongly consistent on $[\tau_1,\tau_2]$, i.e.,

$$\sup_{x \in [\tau_1,\tau_2]} |\hat F_n(x) - F_0(x)| \to 0 \quad \text{a.s.}$$


4.  Proof  of  Theorem  2.1 

Recall that $L$ may take the value $-\infty$ and $R$ the value $+\infty$. The normalized log-likelihood is

$$\mathcal{L}_n(F) = \frac{1}{n} \sum_{j=1}^{n} \log[F(R_j) - F(L_j)], \quad F \in \mathcal{F}.$$

By the strong law of large numbers (SLLN), $\mathcal{L}_n(F)$ converges almost surely to its mean

$$\mathcal{L}(F) = E(\log[F(R) - F(L)]) = \sum_{k=1}^{\infty} P(K = k)\, E(h_{F,k}(Y_{k,1}, \dots, Y_{k,k}) \mid K = k),$$

where

$$h_{F,k}(y_1, \dots, y_k) = \sum_{j=0}^{k} (F_0(y_{j+1}) - F_0(y_j)) \log(F(y_{j+1}) - F(y_j))$$

for $-\infty = y_0 < y_1 < \dots < y_k < y_{k+1} = \infty$. Here and below we interpret $0 \log 0 = 0$ and
$\log 0 = -\infty$.
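
As a concrete reading of the normalized log-likelihood, the following Python sketch evaluates the average of $\log[F(R_j) - F(L_j)]$ for a candidate distribution function. It is a minimal illustration under stated assumptions: the names `normalized_log_likelihood`, `uniform03_cdf`, and `sample` are hypothetical, the intervals are made-up data, and the convention $\log 0 = -\infty$ is implemented explicitly.

```python
import math

def normalized_log_likelihood(cdf, intervals):
    """Average of log[F(R_j) - F(L_j)] over the observed endpoint pairs."""
    total = 0.0
    for left, right in intervals:
        mass = cdf(right) - cdf(left)
        # Convention from the text: log 0 = -infinity.
        total += math.log(mass) if mass > 0 else -math.inf
    return total / len(intervals)

def uniform03_cdf(x):
    """Distribution function of the uniform law on [0, 3]; accepts +/- infinity."""
    return min(max(x / 3.0, 0.0), 1.0)

# Hypothetical endpoint pairs (L_j, R_j); L may be -inf and R may be +inf.
sample = [(-math.inf, 1.5), (1.5, math.inf), (0.75, 2.25)]
value = normalized_log_likelihood(uniform03_cdf, sample)
```

Each of the three intervals carries mass 1/2 under the uniform law on $[0,3]$, so `value` equals $\log(1/2)$ here. An interval that receives zero mass makes the whole average $-\infty$.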

It is easy to check that, for each positive integer $k$ and real numbers $y_1 < \dots < y_k$, the
expression $h_{F,k}(y_1, \dots, y_k)$ is maximized by a function $F \in \mathcal{F}$ if and only if $F(y_j) = F_0(y_j)$ for
$j = 1, \dots, k$. Since $\sup\{|p \log p| : 0 < p < 1\} < 1$, $|h_{F_0,k}|$ is bounded by $k$. Since $K$ has finite
expectation, we see that $\mathcal{L}(F_0)$ is finite. Hence $F_0$ maximizes $\mathcal{L}(\cdot)$ over the set $\mathcal{F}$ and any other
function $F \in \mathcal{F}$ that maximizes $\mathcal{L}(\cdot)$ satisfies $F = F_0$ a.e. $\mu$.

Let $\{F_n\}$ be a sequence in $\mathcal{F}$. By a pointwise limit of this sequence we mean an $F \in \mathcal{F}$ such
that $F_{n'}(x) \to F(x)$ for all $x \in \mathbb{R}$ and some subsequence $\{n'\}$. Helly's selection theorem (Rudin
(1976), pg 167) guarantees the existence of pointwise limits. Let now $\Omega'$ be the set of all sample
points $\omega$ for which the sequence $\{\hat F_n(\cdot;\omega)\}$ has only pointwise limits $F$ such that $\mathcal{L}(F) \ge \mathcal{L}(F_0)$. In
view of the above discussion, for each $\omega \in \Omega'$, all the limit points of $\{\hat F_n(\cdot;\omega)\}$ equal $F_0$ a.e. $\mu$, and
this gives that $\int |\hat F_n(x;\omega) - F_0(x)|\, d\mu(x) \to 0$. Thus the desired result follows if we show that $\Omega'$
has probability 1. Let $\hat Q_n$ denote the empirical estimator of $Q$, the distribution of $(L,R)$. By the
SLLN, $\Omega_0 = \{\mathcal{L}_n(F_0) \to \mathcal{L}(F_0)\}$ has probability 1, and so does $\Omega_U = \{\hat Q_n(U) \to Q(U)\}$ for every
Borel subset $U$ of $\Delta = \{(l,r) : -\infty \le l < r \le \infty\}$. Thus we are done if we show that $\Omega'$ contains
the intersection $\Omega^*$ of $\Omega_0$ and $\bigcap_{U \in \mathcal{U}} \Omega_U$ for some countable collection $\mathcal{U}$ of Borel subsets of $\Delta$.

Let $\alpha$ be a positive integer. Then there are finitely many extended real numbers

$$-\infty = q_0 < q_1 < q_2 < \dots < q_\beta = \infty$$

such that $\mu((q_{i-1}, q_i)) \le 2^{-\alpha}$ for $i = 1, \dots, \beta$. Now form the sets $U_0, \dots, U_{2\beta}$ by setting $U_{2i-1} = (q_{i-1}, q_i)$
for $i = 1, \dots, \beta$, and $U_{2i} = [q_i, q_i]$ for $i = 0, \dots, \beta$. Let $\mathcal{U}_\alpha$ denote the collection of all
nonempty sets of the form $U_{ij} = \Delta \cap (U_i \times U_j)$ for $0 \le i \le j \le 2\beta$. We shall take $\mathcal{U} = \bigcup_\alpha \mathcal{U}_\alpha$.

Let now $\omega$ belong to $\Omega^*$. Let $F_n$ denote the distribution function defined by $F_n(x) = \hat F_n(x;\omega)$
and $Q_n$ the measure defined by $Q_n(A) = \hat Q_n(A;\omega)$. Let $F$ be a pointwise limit of $\{F_n\}$. For
simplicity in notation we shall assume that $F_n(x) \to F(x)$ for all $x \in \mathbb{R}$. We shall show that

$$\mathcal{L}(F_0) \le \liminf_{n \to \infty} \mathcal{L}_n(\hat F_n)(\omega) \le \limsup_{n \to \infty} \mathcal{L}_n(\hat F_n)(\omega) \le \mathcal{L}(F).$$

The first inequality follows from $\mathcal{L}_n(\hat F_n)(\omega) \ge \mathcal{L}_n(F_0)(\omega)$, a consequence of the definition of the
GMLE, and the fact that $\mathcal{L}_n(F_0)(\omega) \to \mathcal{L}(F_0)$ by the choice of $\omega$. Thus we only need to establish
the last inequality. For this note that $\mathcal{L}_n(\hat F_n)(\omega)$ can be expressed as

$$\int_\Delta \log[F_n(r) - F_n(l)]\, dQ_n(l,r).$$

The desired inequality is thus equivalent to

$$\limsup_{n \to \infty} \int_\Delta \log[F_n(r) - F_n(l)]\, dQ_n(l,r) \le \int_\Delta \log[F(r) - F(l)]\, dQ(l,r). \tag{4.1}$$



Now fix a positive integer $\alpha$ and a negative integer $q$. Then

$$\int_\Delta \log[F_n(r) - F_n(l)]\, dQ_n(l,r) \le \int_\Delta q \vee \log[F_n(r) - F_n(l)]\, dQ_n(l,r) \le \sum_{U \in \mathcal{U}_\alpha} M_n(U) Q_n(U),$$

where

$$M_n(U) = \sup_{(l,r) \in \bar U} q \vee \log[F_n(r) - F_n(l)]$$

and $\bar U$ is the closure of $U$. It is easy to check that $M_n(U) = q \vee \log[F_n(r_U) - F_n(l_U)]$, where
$r_U = \sup\{r : (l,r) \in U\}$ and $l_U = \inf\{l : (l,r) \in U\}$. Thus

$$M_n(U) \to M(U) := q \vee \log[F(r_U) - F(l_U)] = \sup_{(l,r) \in \bar U} q \vee \log[F(r) - F(l)].$$

Also, by the choice of $\omega$, $Q_n(U) \to Q(U)$ for all $U \in \mathcal{U}_\alpha$. Therefore we can conclude that

$$\sum_{U \in \mathcal{U}_\alpha} M_n(U) Q_n(U) \to \sum_{U \in \mathcal{U}_\alpha} M(U) Q(U).$$

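Let now

$$m(U) = \inf_{(l,r) \in \bar U} q \vee \log[F(r) - F(l)], \quad U \in \mathcal{U}_\alpha.$$

Using the bound

$$|q \vee \log(x) - q \vee \log(y)| \le e^{-q} |x - y|, \quad 0 \le x, y \le 1,$$

it is easy to verify that

$$M(U) - m(U) \le e^{-q} \sup_{(l,r) \in \bar U} [F(r_U) - F(r) + F(l) - F(l_U)], \quad U \in \mathcal{U}_\alpha.$$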

This shows the following.

(1) If $U = \Delta \cap [(q_{i-1}, q_i) \times (q_{j-1}, q_j)]$, then $M(U) - m(U) > 2/\alpha$ implies either $F(q_i) - F(q_{i-1}) > e^q/\alpha$
or $F(q_j) - F(q_{j-1}) > e^q/\alpha$;

(2) if $U = \Delta \cap [[q_i, q_i] \times (q_{j-1}, q_j)]$, then $M(U) - m(U) > 2/\alpha$ implies $F(q_j) - F(q_{j-1}) > e^q/\alpha$;

(3) if $U = \Delta \cap [(q_{i-1}, q_i) \times [q_j, q_j]]$, then $M(U) - m(U) > 2/\alpha$ implies $F(q_i) - F(q_{i-1}) > e^q/\alpha$.

Of course, if $U$ contains only one point, then $M(U) - m(U) = 0$. Using this, we derive

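$$\begin{aligned}
\sum_{U \in \mathcal{U}_\alpha} (M(U) - m(U)) Q(U)
&\le \frac{2}{\alpha} + |q| \sum_{U \in \mathcal{U}_\alpha} Q(U)\, I[(M(U) - m(U)) > 2/\alpha] \\
&\le \frac{2}{\alpha} + |q| \sum_{i=1}^{\beta} P(q_{i-1} < L < q_i)\, I[F(q_i) - F(q_{i-1}) > e^q/\alpha] \\
&\quad + |q| \sum_{j=1}^{\beta} P(q_{j-1} < R < q_j)\, I[F(q_j) - F(q_{j-1}) > e^q/\alpha] \\
&\le \frac{2}{\alpha} + |q| (1 + \alpha e^{-q})\, 2^{1-\alpha}.
\end{aligned}$$

In the last step we use the facts that

$$P(q_{i-1} < L < q_i) + P(q_{i-1} < R < q_i) \le 2\mu((q_{i-1}, q_i)) \le 2^{1-\alpha}$$

and that at most $1 + \alpha e^{-q}$ among the terms $F(q_1) - F(q_0), \dots, F(q_\beta) - F(q_{\beta-1})$ exceed $e^q/\alpha$.
Combining the above shows that

$$\limsup_{n \to \infty} \int_\Delta \log[F_n(r) - F_n(l)]\, dQ_n(l,r) \le \int_\Delta q \vee \log[F(r) - F(l)]\, dQ(l,r) + \frac{2}{\alpha} + |q| (1 + \alpha e^{-q})\, 2^{1-\alpha}.$$

The desired inequality (4.1) follows from this by first letting $\alpha \to \infty$ and then $q \to -\infty$.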


5.  Proof  of  the  Propositions 

Fix $\omega \in \Omega'$. Abbreviate $\hat F_n(\cdot;\omega)$ by $F_n$. Let $F$ be a pointwise limit of $F_n$. Without loss of
generality, assume that $\lim_{n \to \infty} F_n(x) = F(x)$ for all $x$. Set

$$D = \{x \in \mathbb{R} : F(x) \ne F_0(x)\}.$$

Since $\int |F_n - F_0|\, d\mu \to 0$ and $\mu$ is a finite measure in view of the assumption $E(K) < \infty$, we have
$\mu(D) = 0$.

Proof of Proposition 3.1: We need to show that $D$ does not contain regular continuity points of
$F_0$. Let $x_0$ be a continuity point of $F_0$. If $x_0$ belongs to $D$, then $F(x_0) \ne F_0(x_0)$ and the continuity
of $F_0$ at $x_0$ and the monotonicity of $F$ and $F_0$ yield that there exists a positive $\epsilon$ such that either
$(x_0 - \epsilon, x_0]$ or $[x_0, x_0 + \epsilon)$ is contained in $D$. Thus either $\mu((x_0 - \epsilon, x_0]) = 0$ or $\mu([x_0, x_0 + \epsilon)) = 0$,
and $x_0$ is not regular. □

Proof of Proposition 3.2: Let $x_0$ be a continuity point of $F_0$ which is also an interior point
of $S$, the set of support points of $\mu$. Then $x_0$ does not belong to $D$; otherwise, there exist, for
each $\epsilon > 0$, support points $x_1$ and $x_2$ of $\mu$ and a positive $\eta$ such that $(x_1 - \eta, x_1 + \eta)$ is contained
in $(x_0 - \epsilon, x_0]$ and $(x_2 - \eta, x_2 + \eta)$ is contained in $[x_0, x_0 + \epsilon)$ and this leads to the contradiction
$\mu(D) > 0$. This shows that $F(x) = F_0(x)$ for all continuity points $x$ of $F_0$ that belong to the
interior of $S$ and proves the first part of Proposition 3.2. The second part follows from the first
part and the monotonicity of $F$ and $F_0$. □

Proof of Proposition 3.3: Suppose every point of increase of $F_0$ is strongly regular. We shall
show that $D$ does not contain continuity points of $F_0$. Let $x_0$ be a continuity point of $F_0$. If $x_0$ is
a point of increase of $F_0$, then it is strongly regular and hence regular and cannot belong to $D$ by
Proposition 3.1. Suppose now $x_0$ is not a point of increase of $F_0$. Then again $x_0$ cannot belong to
$D$. Otherwise, either $F(x_0) > F_0(x_0)$ or $F(x_0) < F_0(x_0)$ and we shall show that each leads to the
contradiction $\mu(D) > 0$. In the first case, $b := \sup\{x : F_0(x) = F_0(x_0)\}$ is a point of increase of $F_0$,
$b > x_0$ and $F(b-) \ge F(x_0) > F_0(x_0) = F_0(b-)$; thus $[x_0, b) \subset D$ and, since $b$ is strongly regular by
our assumption, $\mu(D) \ge \mu((x_0, b)) > 0$. In the second case, $a := \inf\{x \in \mathbb{R} : F_0(x) = F_0(x_0)\}$ is a
point of increase of $F_0$, $a < x_0$ and $F(a) \le F(x_0) < F_0(x_0) = F_0(a)$; thus $[a, x_0) \subset D$ and, since $a$
is strongly regular by our assumption, $\mu(D) \ge \mu([a, x_0)) > 0$. This shows that $D$ does not contain
continuity points of $F_0$, which is the desired result of Proposition 3.3. □

Proof of Proposition 3.6: Make the assumptions of Proposition 3.6. Then $D$ is empty; otherwise,
we can use the continuity of $F_0$ to construct an open interval that contains a point of
increase of $F_0$ and is contained in $D$, and arrive at the contradiction $\mu(D) > 0$. Since $D$ is empty,
$F_n$ converges to $F_0$ pointwise and hence uniformly as $F_0$ is continuous. This proves Proposition
3.6. □

Proof of Proposition 3.8: We shall only give the proof in the case $\mu(\{\tau_1\}) > 0$ and $F_0(\tau_2-) = 1$.
We shall show that $D \cap [\tau_1,\tau_2] = \emptyset$. This implies that $F_n(x) \to F_0(x)$ for all $x \in [\tau_1,\tau_2]$, and, by
the continuity assumption on $F_0$, this convergence is even uniform on $[\tau_1,\tau_2]$.

It follows from Corollary 2.3 that $F(\tau_1) = F_0(\tau_1)$. This gives the desired result if $F_0(\tau_1) = 1$.
Thus assume from now on that $F_0(\tau_1) < 1$. We are left to show that $D_1 = D \cap (\tau_1,\tau_2]$ is empty.
If $D_1$ were not empty, we could use the continuity assumption on $F_0$, the monotonicity of $F_0$ and
$F$ and $F(\tau_1) = F_0(\tau_1) < F_0(\tau_2-) = 1$ to show that $D_1$ contains an open interval $(a,b)$ such that
$0 < F_0(a) < F_0(b) < 1$ and $\tau_1 < a < b < \tau_2$ and arrive at the contradiction $\mu(D) \ge \mu((a,b)) > 0$.
a consistent estimator of the asymptotic variance of the SCE are presented.
A proof of the strong consistency of the SCE is also presented. Our simulation
studies indicate that the estimate of the asymptotic variance is very
close to the true value even with moderate sample sizes and high censoring
Dose-Ranging Study of Indole-3-Carbinol for Breast
Cancer Prevention

George  Y.C.  Wong,*  Leon  Bradlow,  Daniel  Sepkovic,  Stephanie  Mehl,  Joshua  Mailman, 
and  Michael  P.  Osborne 

Strang  Cancer  Prevention  Center,  New  York,  New  York 


Abstract Sixty women at increased risk for breast cancer were enrolled in a placebo-controlled, double-blind
dose-ranging chemoprevention study of indole-3-carbinol (I3C). Fifty-seven of these women with a mean age of 47 years
(range 22-74) completed the study. Each woman took a placebo capsule or an I3C capsule daily for a total of 4 weeks;
none of the women experienced any significant toxicity effects. The urinary estrogen metabolite ratio of
2-hydroxyestrone to 16α-hydroxyestrone, as determined by an ELISA assay, served as the surrogate endpoint biomarker
(SEB). Perturbation in the levels of SEB from baseline was comparable among women in the control (C) group and the 50,
100, and 200 mg low-dose (LD) group. Similarly, it was comparable among women in the 300 and 400 mg high-dose
(HD) group. Regression analysis showed that peak relative change of SEB for women in the HD group was significantly
greater than that for women in the C and LD groups by an amount that was inversely related to baseline ratio; the
difference at the median baseline ratio was 0.48 with 95% confidence interval (0.30, 0.67). No other factors, such as age
and menopausal status, were found to be significant in the regression analysis. The results in this study suggest that I3C at
a minimum effective dose schedule of 300 mg per day is a promising chemopreventive agent for breast cancer
prevention. A larger study to validate these results and to identify an optimal effective dose schedule of I3C for long-term
breast cancer chemoprevention will be necessary. J. Cell. Biochem. Suppls. 28/29:111-116. © 1998 Wiley-Liss, Inc.

Key  words:  chemoprevention;  estrogen  metabolites;  surrogate  endpoint  biomarker 


Indole-3-carbinol (I3C) is a compound present in cruciferous vegetables such as broccoli, Brussels sprouts, cabbage, and cauliflower. This compound has been shown to protect against certain chemical carcinogens, and to induce the enzyme P450A1, which is responsible for the formation of the estrogen metabolite 2-hydroxyestrone [1]. Cell culture experiments have shown that 2-hydroxyestrone acts to block proliferation and inhibit promotion of anchorage independent growth in mouse mammary cells, while its competitive counterpart 16α-hydroxyestrone acts in a promotional manner [2,3]. Therefore, the ratio of 2-hydroxyestrone to 16α-hydroxyestrone, as determined by an ELISA assay [4], is a potential surrogate endpoint biomarker (SEB) for breast cancer prevention. Two animal studies have shown that elevating the estrogen metabolite ratio protects against mammary tumor formation. Bradlow et al. [5] showed this to be the case in the C3H/OuJ model, and Grubbs et al. [6] showed this in the DMBA-induced rat model. In the latter case, protection was almost complete. A study in women at various levels of breast cancer risk showed that 16α-hydroxyestrone was elevated in women at greater familial risk for breast cancer [7]. The same phenomenon had been observed in mice at different levels of breast cancer risk [8]. In a recent study, women who had a low metabolite ratio due primarily to the presence of an enzyme defect, which blocks 2-hydroxylation of estradiol, showed a 10-fold increase in breast cancer incidence [9]. The ability of I3C to promote 2-hydroxylation has been demonstrated both in breast cancer cell culture experiments [10,11] and in animal studies [5,6].

The ability of I3C to induce a significant increase in 2-hydroxylation in humans in a short time was first demonstrated by Michnovicz and Bradlow [12]. A 3-month trial of I3C at 400 mg per day against a placebo control and a high fiber diet control showed that the metabolite shift in favor of the 2-hydroxylation pathway was sustained over the entire trial period and that no significant adverse effects were observed [13]. The results from these studies suggest that I3C may be a promising chemopreventive agent for breast cancer prevention. We launched a short-term dose-ranging study of I3C in women at increased risk for breast cancer. The overall aim of the intervention study was to determine a minimum effective dose (MED) of I3C, which will not exceed the safely tolerated dose of 400 mg per day established in [13] and which will result in a sustained increase in 2-hydroxylation over a 4-week trial period. Five doses of I3C were considered: 50, 100, 200, 300, and 400 mg. A secondary objective of the study was to assess toxicity effects of I3C when taken daily for 4 consecutive weeks. The SEB used in this study was the ratio of urinary 2-hydroxyestrone to 16α-hydroxyestrone. Sixty women were recruited in the study, and full compliance was obtained in 57. A placebo-controlled, double-blind trial design was adopted for the study. MED was statistically determined to be 300 mg, and a significant difference was established in the up-regulation of the SEB between the MED group and the placebo group. No significant toxicity effects were observed in the 57 women at the end of the 4-week trial.

STUDY  POPULATION 

Adult women in good general health but at increased risk for breast cancer were candidates for the dose-ranging study. A woman is considered to be at increased risk for breast cancer if she is over 60 years of age or has a family history of the disease (at least one first-degree relative or at least two second-degree relatives with a history of breast cancer). Women who have had a diagnosis of lobular carcinoma in situ or atypical hyperplasia are also considered to be at increased risk in our study.

A number of exclusion criteria were imposed in order to minimize the chances of confounding the outcome of the particular estrogen biomarker chosen. These included thyroid disorders, regular cigarette smoking within the last 6 months, obesity defined as 25% overweight using the nomograph for Body Mass Index, severe anorexia, breast feeding, pregnancy or intention to become pregnant during the study period. In addition, women who have had any form of cancer other than basal or squamous cell carcinoma of the skin, or carcinoma in situ of the cervix, were excluded from the study. Finally, women who regularly consume a large amount of cruciferous vegetables were also excluded because of the nature of our intervention study.

A total of 60 women who were eligible for the trial were selected from over 100 women who volunteered to participate in the study. Most of the women came from the New York metropolitan area. Each eligible woman was required to sign an informed consent form before entering the study.

STUDY  DESIGN 

A placebo-controlled, double-blind design was adopted for the dose-ranging study. Because a rigorous toxicity analysis had not been previously carried out, a dose-escalation scheme was used in the dose assignment for safety considerations. First, ten women in the control group were given placebo capsules. This was followed by assignments of ten women to each of the five ascending dose groups.

A pre-menopausal participant was asked to schedule her appointment within 3 days after her next period ended. Every participant was asked to bring in two first morning urine samples, one from the morning prior to the appointment and the other from the morning of the appointment. A blood sample was taken from each eligible woman on the appointment day and she was given a bottle containing seven capsules of placebo or I3C. At weekly intervals thereafter, for a total of 4 weeks, a first morning urine sample and a blood sample were collected, and a refill was dispensed for the following week. Because no reliable biochemical tests for I3C metabolites are available, compliance monitoring was carried out by both pill count and an interview.

STATISTICAL  METHODS 

In the dose-ranging analysis, the level of perturbation of the SEB, namely the urinary estrogen ratio, at any time point was expressed as relative change from baseline. For each dose group, including the placebo group, the peak relative change (PRC) over the 4-week trial period was obtained for each woman, and the mean of the PRC was used to estimate the peak relative perturbation for the particular dose group over the trial period. We remark that a more sensitive approach utilizing a parametric statistical model was not feasible here because the individual response profiles could not be summarized by a simple parametric curve (for instance, a sigmoidal curve). The estimated PRC from each dose group was then plotted against the dose of I3C to search for an MED. Our dose-ranging study suggests a clear dichotomy of response between a low-dose group involving 50, 100, and 200 mg, and a high-dose group involving 300 and 400 mg; therefore, parametric model fitting at this stage of the dose-ranging study to identify an MED was not necessary. Comparisons of PRC among the dose groups were adjusted for confounding factors using linear regression. To ensure no serious statistical biases were introduced into the dose-ranging analysis due to non-randomness of dose assignment, distributions of various factors that could contribute to biases were compared across the three dose groups.
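
The PRC summary described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes a single baseline ratio per woman and takes "peak" to mean the largest signed relative change over the four weekly measurements. The function names and the participant values are hypothetical.

```python
from statistics import mean

def relative_changes(baseline, weekly_ratios):
    """Relative change from baseline at each weekly measurement."""
    return [(ratio - baseline) / baseline for ratio in weekly_ratios]

def peak_relative_change(baseline, weekly_ratios):
    """Largest relative change over the trial (assumed definition of PRC)."""
    return max(relative_changes(baseline, weekly_ratios))

# Hypothetical participants: (baseline ratio, ratios at weeks 1 to 4).
group = [
    (2.0, [2.2, 3.0, 2.6, 2.4]),
    (1.0, [1.5, 1.25, 1.0, 0.75]),
]
prcs = [peak_relative_change(b, weekly) for b, weekly in group]
group_mean_prc = mean(prcs)  # estimate of the group's peak relative perturbation
```

The group mean of the individual PRC values is the quantity that the text plots against dose when searching for an MED.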

FOOD  ITEM  ANALYSIS 

Every participant was required to complete a simple food intake questionnaire regarding her eating habits in the 3 months preceding her initial interview for the intervention trial. Both the frequency and the serving size of a variety of vegetables, including most known I3C-rich vegetables, were recorded for each woman. A numeric score representing the total monthly consumption of a specific vegetable item was calculated from the food intake data. Assuming equal weight for every vegetable item, we derived for each woman an I3C vegetable consumption score and the proportion of I3C vegetables in the total vegetables consumed, averaged over a month. Data from a total of 54 participants were available for such a food intake analysis. Neither the I3C score nor the proportion of I3C vegetable consumption was significantly related to baseline urinary estrogen ratio.

TOXICITY  ANALYSIS 

Clinical chemistry and complete blood counts were determined from the blood samples collected at baseline and at the end of each of the 4 consecutive weeks of trial. Any parameter whose measured value was outside the normal range was investigated for possible toxicity. Except for two participants who had unexplained small increases in the liver enzyme SGPT level (43 to 65, and 30 to 71), no other toxicity effects were encountered.


DOSE-RANGING  ANALYSIS 

A total of 57 women were evaluable for the entire dose-ranging study. Except for three women from New Jersey, all of the 57 women were from the New York metropolitan area. Fifty-two (91%) of the women were white. Forty-six (81%) were college educated, and twenty-four (42%) completed graduate studies. The average age of the participants was 46.7 years (range 22-74). The average age at menarche was 12.4 years (range 8-18). Forty (70%) of the women were pre-menopausal, and 38 (67%) of the women had been pregnant at least once.

Figure 1 displays the sample mean relative change of the estrogen ratio from baseline over time for the control group (n = 10), 50 mg group (n = 7), 100 mg group (n = 10), 200 mg group (n = 10), 300 mg group (n = 10), and 400 mg group (n = 10). The profiles suggest a segregation of the treated groups into a low-dose group (LD) consisting of women in the 50, 100, and 200 mg groups, and a high-dose (HD) group consisting of women in the 300 and 400 mg groups. Moreover, the plots also suggest that the control group (C) was not significantly different from the LD group. For the sake of statistical power, the dose-ranging analysis hereafter will compare data from the C, LD, and HD groups.

Before we can statistically compare the levels of perturbation of the SEB in the three groups, we have to rule out the presence of any statistical bias due to non-randomness of dose assignments to the participants. To this end, we examined the distributions of a number of potential confounding factors, including age, age at menarche, baseline estrogen ratio, menopausal status, pregnancy history, and educational level. No significant differences were found across the three groups with respect to such factors.

Fig. 1. Mean relative change of urinary estrogen ratio profile plots for the control and five dose groups.

Within each of the three groups, we identified the PRC for each of the participants in the group and calculated the usual 95% confidence interval (CI) for the population mean PRC for the group. Figure 2 presents the individual PRC values and the CI for each group. There was no significant difference in mean PRC between C and LD. The sample mean ± SD of PRC for LD was 0.33 ± 0.36 and that for HD was 0.81 ± 0.57; the difference of 0.48 was significant at P = 0.001 by the two-sample t-test. The 95% CI for the difference in mean PRC between the HD and LD groups was estimated to be (0.22, 0.76).
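
As a rough plausibility check on the reported interval, the following Python sketch computes a pooled-variance confidence interval from the summary statistics alone. It is an approximation under stated assumptions, not a reconstruction of the authors' calculation: the group sizes (27 for LD, 20 for HD) are taken from the group counts given with Figures 1 and 2, a normal quantile is used in place of the Student t quantile, and the function name `pooled_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def pooled_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Approximate CI for mean1 - mean2 using a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Normal quantile (assumption); a Student t quantile would be slightly larger.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

# HD: 0.81 +/- 0.57 with n = 20; LD: 0.33 +/- 0.36 with n = 27.
lower, upper = pooled_ci(0.81, 0.57, 20, 0.33, 0.36, 27)
```

The resulting interval lies close to, but does not coincide exactly with, the reported (0.22, 0.76). Rounding of the summary statistics and the choice of quantile could both contribute to the gap.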

The  perturbation  results  were  unadjusted  for 
any  confounding  factors.  Menopausal  status  was 
a  major  concern  in  the  comparison.  Figure  3a, b 
shows  that  within  each  of  HD  group  and  C  + 
LD  group,  there  was  no  significant  difference  in 
mean  relative  change  of  the  SEB  from  baseline 
between  pre-menopausal  and  post-menopausal 
women  over  the  entire  trial  period.  The  same 
conclusion  was  true  for  comparison  based  on 
PRC.  Besides  menopausal  status,  we  also  in¬ 
cluded  age,  age  at  menarche,  baseline  estrogen 
ratio,  and  educational  level  in  a  multivariate 


Control  Low  High 

(N-10)  (N*27)  (N*20) 


Fig.  2.  Comparison  of  peak  relative  change  of  urinary  estrogen 
ratio  among  control,  low-  and  high-dose  groups.  Difference 
between  the  high-dose  group  and  the  other  two  dose  groups, 
unadjusted  for  confounding  factors,  was  significant  at  P  =  0.05. 


[Figure 3 panels: (a) Control and Low Dose; (b) High Dose. Horizontal axis: Time in Weeks]

Fig. 3. a,b: Mean relative change of urinary estrogen ratio
profile plots stratified by menopausal status. Vertical bars represent
usual 95% confidence intervals for the mean. No significant
difference in mean relative change between pre-menopausal
and post-menopausal women was established within
both the control + low-dose group and the high-dose group.

regression to attempt to explain the variation
in the observed PRC. For the C + LD group, the
variation in PRC could only be explained by
random inter-participant differences. However,
for the HD group, about 50% of the total variation
in PRC was significantly explained by a
regression-towards-the-mean effect of baseline
estrogen ratio (P = 0.001). Figure 4 displays
the linear relationship between PRC and baseline
ratio for the HD group, and the lack of
correlation in the case of the C + LD group.

From regression analysis, we found a significant
adjusted difference in PRC between the
two groups as long as baseline estrogen ratio
was less than 2.92. Table I tabulates the differences
and the corresponding 95% CIs for some
selected values of baseline ratio.
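
One reading of the adjusted differences in Table I is the fitted HD regression line minus the C + LD mean PRC. The sketch below checks that reading against the rounded coefficients printed in the Fig. 4 caption (y = 1.38 − 0.29x for HD; mean 0.32 for C + LD). This is an assumption about how the tabled values arise, not the authors' stated procedure, and the function name is hypothetical. With rounded coefficients the sketch agrees with Table I to within about 0.01.

```python
# Rounded values as printed in the Fig. 4 caption (not a full-precision fit).
HD_INTERCEPT = 1.38
HD_SLOPE = -0.29
C_LD_MEAN_PRC = 0.32

def adjusted_difference(baseline_ratio: float) -> float:
    """Fitted HD PRC at this baseline ratio minus the C + LD mean PRC."""
    return HD_INTERCEPT + HD_SLOPE * baseline_ratio - C_LD_MEAN_PRC

# Baseline ratios and adjusted differences as listed in Table I.
TABLE_I = {"Q1": (1.41, 0.65), "M": (2.01, 0.48), "Q3": (2.66, 0.29), "C": (2.92, 0.22)}

for label, (ratio, reported) in TABLE_I.items():
    sketch = adjusted_difference(ratio)
    print(f"{label}: baseline {ratio:.2f}, sketch {sketch:.3f}, Table I {reported:.2f}")
```

The negative slope is what makes the difference shrink as the baseline ratio rises, which is the regression-towards-the-mean pattern described in the text.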

DISCUSSION 

The goal of this placebo-controlled, double-blind
study was to determine a minimum effective
and safe dose schedule of I3C that will
result in a significant increase in the urinary
estrogen metabolite ratio of 2-hydroxyestrone
to 16α-hydroxyestrone. We have shown in a
sample of 57 women that an appropriate choice
of MED was 300 mg and that daily intake of
I3C at this dose presented no significant toxicity
in a 4-week trial. At this MED dose schedule,
peak relative change of the estrogen metabolite
ratio was significantly greater than
that at the lower doses, and the difference was
more pronounced for women with lower baseline
ratios. However, there was no significant
perturbation of the biomarker for women with
high and presumably already protective baseline
ratios. Menopausal status was not a significant
factor for perturbation of the biomarker in
our analysis, although there was a trend towards
greater up-regulation of the ratio in the
case of pre-menopausal women. A larger study
should be conducted to confirm the findings
reported here, particularly the lack of effect of
menopausal status on the perturbation of the
biomarker, and to identify an optimal effective
dose schedule for a long-term breast cancer
prevention trial.


[Figure 4 legend: ■ High Dose; + Low Dose; - Control]

Fig. 4. Plot of peak relative change of urinary estrogen ratio vs. baseline value. Linear regression was significant only in the high-dose
group: y = 1.38 − 0.29x, P = 0.001, R² = 0.50. For the control + low-dose group, mean peak relative change ± SD = 0.32 ± 0.33.


TABLE I. Adjusted Differences in PRC of
Urinary Estrogen Ratio Between High-Dose
Group and Combined Control and Low-Dose
Group*

Baseline ratio   Adjusted difference   95% CI lower   95% CI upper   P value
Q1 = 1.41        0.65                  0.44           0.88           <0.001
M  = 2.01        0.48                  0.30           0.67           <0.001
Q3 = 2.66        0.29                  0.10           0.49           0.004
C  = 2.92        0.22                  0.00           0.43           0.05

*Q1, M, and Q3 represent the first quartile, median and
third quartile of baseline ratio, respectively. C represents
the critical baseline ratio beyond which there was no significant
difference in PRC between the two groups.

from x, g(·) is a known link function and f(·) is a function in u. Conventionally, u are some metrically scaled variables while x are qualitative
regressors.

Parameter estimation in varying coefficient models can be done by a local likelihood approach (Tibshirani & Hastie [J.A.S.A. 82
1987:559-568]), which is directly feasible with standard software for generalized linear models. Theoretical results yield asymptotic consistency of
estimates and allow for asymptotic pointwise confidence bands. Moreover, a direct correction of the estimation bias is available by using a simple
Fisher scoring routine.

The theoretical results are supported by simulations and real data examples.


126. SURVIVAL ANALYSIS I

ON  SEMIPARAMETRIC  RANDOM  CENSORSHIP  MODELS 
Gerhard  Dikta 

Fachhochschule Aachen, Abteilung Jülich, Ginsterweg 1, D-52428 Jülich, Germany DIKTA@FHSERVER03.DVZ.FH-AACHEN.DE

In the random censorship model one observes data of the form (Z, δ) where Z = min(X, Y), X is independent of Y, and δ indicates whether X
is censored (δ = 0) or not (δ = 1). Denote by m(x) = E(δ | Z = x) the regression function of the binary datum δ given Z = x and assume that m belongs
to a parametric family with parameter space Θ ⊂ ℝ^k, i.e. m(x) = m(x, θ_0) and θ_0 ∈ Θ. We propose a semiparametric estimator of the distribution
function F of X, denoted by F_n, which is based upon maximum likelihood estimation of θ_0, and which generalizes the Cheng and Lin estimator in
the proportional hazards model. We establish uniform consistency and a functional central limit result for F_n which is compared to that of the
Kaplan-Meier estimator.

VARIANCE  OF  THE  MLE  OF  A  SURVIVAL  FUNCTION  WITH  DOUBLY-CENSORED  DATA 
Qiqing Yu, Linxiong Li and George Wong

Qiqing Yu, SUNY at Binghamton, University of New Orleans, and Strang Cancer Preventive Institute


The asymptotic properties of the nonparametric MLE or the self-consistent estimator of a survival function with doubly-censored data have
been studied by many authors. However, to date, it is not clear from the literature how to produce an estimate of the asymptotic variance of the MLE
of $S(t)$ with doubly-censored data, even though the existence of such an asymptotic variance has been proved, in an abstract form in a Banach
space (Gu and Zhang (1993)). We present explicit expressions for the asymptotic variance of the generalized MLE and for its estimator.

A simulation study indicates that the approximation is close even with sample size $n=100$ and a censoring probability of $85\%$.


DOUBLE CENSORING: CHARACTERIZATION AND COMPUTATION OF THE NONPARAMETRIC MAXIMUM LIKELIHOOD ESTIMATOR


Jon A. Wellner and Yihui Zhan

Yihui Zhan, University of Washington, Department of Statistics, Box 354322, Seattle, WA 98195. Email: zhan@stat.washington.edu


KEY WORDS: Double Censoring, NPMLE, ICM Algorithm, Hybrid Algorithm

While the likelihood equations have a unique solution in the case of right censored data, this is no longer the case for doubly censored
data: the likelihood equations may have multiple solutions in the case of double censoring. Algorithms such as the EM algorithm designed to
calculate one solution of the likelihood equations may converge to a self-consistent estimate other than the NPMLE. The ambiguity of the EM
algorithm in calculating the NPMLE for doubly censored data and its known slow convergence rate pose real difficulties in applications, especially
when bootstrap methods are used for inference.

In this paper we present a characterization of the NPMLE for doubly censored data. The NPMLE is characterized as the left-derivative
of a convex minorant formed by derivatives of the likelihood function. The NPMLE is shown to be one of the self-consistent estimates maximizing the
likelihood function. Based on the characterization, we propose a new hybrid algorithm that utilizes a composite algorithmic mapping of the EM
algorithm and the modified ICM algorithm. Numerical simulations demonstrate that the hybrid algorithm converges to the NPMLE more rapidly
than either the EM or the naive ICM algorithm for doubly censored data.

PROPERTIES  OF  TEST  STATISTICS  APPLIED  TO  RESIDUALS  IN  FAILURE  TIME  MODELS 

Inmaculada  B.  Aban,  Edsel  A.  Pena 

Inmaculada B. Aban, Department of Math (084), University of Nevada Reno, Reno, NV 89557


KEY  WORDS:  Generalized  Residual  Process,  Goodness-of-Fit,  Model  Validation 

Asymptotic  properties  of  a  class  of  test  statistics  when  applied  to  hazard-based  residuals  arising  in  survival  and  reliability  models  will 
be  presented.  These  test  statistics  are  useful  in  goodness-of-fit  testing  and  model  validation.  The  properties  are  obtained  by  examining  the  asymptotic 
properties of generalized residual processes, which are (possibly random) time-transformations of the processes associated with the incomplete failure
times.  Since  the  time-transformations  depend  on  unknown  model  parameters,  the  residual  processes  are  obtained  by  replacing  the  unknown 
parameters  by  their  estimators.  The  results  therefore  shed  light  on  the  effects  of  estimating  parameters  to  obtain  the  residual  processes.  Implications 
concerning  possible  pitfalls  of  some  existing  model  validation  procedures  utilizing  hazard-based  residuals  and  ways  to  correct  these  problems  will 
be  discussed. 

320 The Behaviour of the Maximum Likelihood Estimator as a Process and
Some  Applications 

Robert  M.  LOYNES 
University  of  Sheffield,  England. 

Given  a  set  of  observations,  supposedly  either  independent  and  identically  distributed  or  from 
a stationary AR process, whose distribution contains a fixed-dimension unknown parameter,
the  behaviour  of  the  maximum-likelihood  estimator  (MLE)  as  a  function  of  the  number  of 
observations  used  contains  evidence  about  whether  the  model  assumptions  are  satisfied,  or 
whether  a  change  of  regime  or  drift  is  taking  place.  A  weak  convergence  result  for  the  process 
of  MLEs  is  given,  which  allows  various  tests  to  be  constructed.  [A  Prof  R.M.  Loynes,  University 
of  Sheffield,  Probability  &  Statistics  Section,  School  of  Maths  &  Stats,  Sheffield  S3  7RH  UK- 
R.LOYNES@SHEFFIELD.AC.UK.] 

321  Comparing  Groups  with  Irregular  Longitudinal  Data 

J.  S.  MARITZ 

Medical  Research  Council,  South  Africa. 

Longitudinal  data  arise  when  observations  of  a  dependent  variable  are  made  at  several  suc¬ 
cessive  time  points.  When  such  data  are  recorded  for  a  number  of  subjects  it  often  happens 
that  the  time  configuration  varies  from  subject  to  subject,  producing  irregular  longitudinal 
data.  Comparison  of  two  or  more  groups  of  subjects  is  considered  using  exact  permutational 
methods.  This  entails  choosing  appropriate  descriptive  and  test  statistics  and  generating  their 
exact distributions. [A J. S. Maritz, MRC-CERSA, PO Box 19070, Tygerberg 7505, South Africa-
SMARITZ@EAGLE.MRC.AC.ZA.] 

322  Repeated  Ordinal  Responses 

Rory  St  John  WOLFE 
Southampton  University ,  UK. 

An approach to modelling repeated ordinal responses is discussed. This involves using 'scaling'
terms in a cumulative logit model [McCullagh J. Roy. Statist. Soc. Ser. B 42 1980:109-142]. The approach
is applied to data from telecommunication experiments. A new general purpose method
of fitting the model in GLIM4 is introduced. Finally the consideration of a random-effects
model is discussed. [A Rory Wolfe, Maths Department, Southampton University, Highfield,
Southampton, SO17 1BJ, UK; RW@MATHS.SOTON.AC.UK.]

323 On Minimum Distance Estimation of Location Parameter for
Interval Censored Data

Vasudaven MANGALAM

Curtin University, Perth, Australia.

Let X_1, X_2, ..., X_n be i.i.d. with distribution function given by F(x − a), where F is an
unknown symmetric distribution and a is an unknown location parameter. T_1, T_2, ..., T_n are
i.i.d. and independent of the X_i's, with unknown distribution G. We observe (T_i, δ_i), i = 1, ..., n,
where δ_i is the indicator of whether X_i is less than or equal to T_i. A minimum distance
estimator is constructed for the parameter and its properties are studied. A two-sample extension
is also considered. [A Vasudaven Mangalam, School of Mathematics and Statistics,
Curtin University of Technology, GPO Box U1987, Perth WA 6001; VASU@CS.CURTIN.EDU.AU.]

324  Variance  of  the  MLE  of  a  Survival  Function  with  Interval  Censored  Data 

Qiqing YU

University of New Orleans.

gated between test power the relationship is investi-
theoretic definitions are used to quantify the notion of approaches zero- Measure-

(9:50) DENSITY ESTIMATION FOR A CLASS OF STATIONARY
NONLINEAR  PROCESSES.  Kamal  C.  Chanda,  Texas  Tech  U 
(10:05)  FLOOR  DISCUSSION 


College  Bowl— 10:30  a.m.  -  12:20  p.m. 


229  H-California  B 

COLLEGE  BOWL  SEMIFINALS  AND  FINALS 

Mu  Sigma  Rho 

Organizers:  Don  Edwards,  U  of  South  Carolina;  Mark  E.  Payton, 
Oklahoma  State  U 

Chair:  Mark  E.  Payton,  Oklahoma  State  U 
Emcee:  George  Casella,  Cornell  U 
Scorekeeper:  Bruce  Collings,  Brigham  Young  U 
Teams:  Winners  of  Quarterfinals  (Session  146) 


Invited  Sessions— 10:30  a.m.  - 12:20  p.m. 


230  H-California  C 

CLASSIFICATION  OF  RACE  AND  ETHNICITY:  A  DISCUSSION— 
Invited  Panel 

Council of Professional Assn on Fed Stat, Sec. on Epidem., Govt. Stat. Sec, Sec.
on Hlth Policy Stats., Social Stat. Sec
Chair/Organizer:  Edward  J.  Spar,  COPAFS 
Panelists:  Katherine  K.  Wallman,  Office  of  Mgmt  &  Budget 
Thomas  Sawyer,  US  House  of  Representatives 
Linda  Gage,  California  State  Finance  Dept 
Margo  Anderson,  U  of  Wisconsin-Milwaukee 
Roderick  J.  Harrison,  US  Bur  of  the  Census 


232  H-Hum 
&  APPLIED  ORDER-RESTRICTED  INFERENCE— Invited  1 

ASA  Council  of  Chapters 

Organizer:  Qing  Liu,  Food  &  Drug  Admin 

Chair:  Roslyn  A.  Stone,  U  of  Pittsburgh 

(10:35)  TESTING  EQUALITY  OF  SURVIVAL  CURVES  UNDEF 

CONSTRAINTS.  Tim  Wright,  U  of  Missouri;  Anura  Abeyratne, 

Co;  Bahadur  Singh,  U  of  Missouri 

(11:00)  ORDERED  INFERENCE  IN  CLINICAL  TRIALS  WITH 
PLE  ENDPOINTS.  Dei-In  Tang,  Nathan  Kline  Inst  for  Psych  Re) 
(11:25)  ORDER-RESTRICTED  INFERENCE  IN  2X2  TABLES  Y 
HETEROGENEOUS  ODDS  RATIOS.  Qing  Liu,  Food  &  Drug , 
(11:50)  Disc:  Jon  H.  Lemke,  U  of  Iowa 
(12:10)  FLOOR  DISCUSSION 

233  M-Orange 
$  PRACTICAL  MARKOV  CHAIN  MONTE  CARLO— Inviu 

Sec.  on  Bayesian  Stat.  Sci.,  ENAR,  WNAR,  IMS,  Bio.  Sec.,  Bus.  &  E 
Sec,  Stat.  Comp.  Sec 

Organizer:  James  H.  Albert,  Bowling  Green  State  U 
Chair: Robert E. Kass, Carnegie-Mellon U
Panelists:  Bradley  P  Carlin,  U  of  Minnesota 
Andrew  Gelman,  Columbia  U 
Minghui  Chen,  Worcester  Polytechnic  Inst 


234  H-Palos 
^MATCHING  AND  CONDITIONAL  INDEPENDENCE:  N 
DEVELOPMENTS  IN  TESTING  AND  ESTIMATION— Invi 
Papers 

Bus.  &  Econ.  Stat.  Sec 

Organizer:  James  J.  Heckman,  U  of  Chicago 
Chair:  Robert  Moffitt,  Johns  Hopkins  U 
(10:35)  CONDITIONAL  INDEPENDENCE  RESTRICTIONS:  T 
AND  ESTIMATION.  Oliver  Linton,  Yale  U;  Pedro  Gozalo,  Brov 
(11:05)  MATCHING  AS  AN  ECONOMETRIC  ESTIMATOR.  H 
Ichimura,  U  of  Pittsburgh;  Petra  Todd,  U  of  Pennsylvania 
(11:35)  ALTERNATIVE  METHODS  FOR  EVALUATING  SOCU 
PROGRAMS:  THEORY  AND  EVIDENCE.  James  J.  Heckman,  l 
Chicago 

(12:05)  FLOOR  DISCUSSION 

235  H-El  Ca 
©JUDGEMENT  IN  OFFICIAL  STATISTICS:  HOW  EXPLIC 
SHOULD  WE  BE?— Invited  Papers 

Govt.  Stat.  Sec. ,  Social  Stat.  Sec. 

Chair/Organizer:  Michael  A.  Stoto,  Nat’l  Academy  of  Sciences 
Panelists:  Jaime  Marquez,  Federal  Reserve  Board 

Francisco J. Samaniego, U of California-Davis
Joseph  Sedransk,  Case  Western  Reserve  U 
Carl  N.  Morris,  Harvard  U 


231  M-Grand  C/D 

INTERACTIONS  BETWEEN  UNIVERSITY  GRADUATE 
PROGRAMS  AND  FOUR-YEAR  COLLEGES— Invited  Papers 

ASA  Cmte  on  Career  Development,  Sec.  on  Stat.  Educ,  Sec.  on  Teaching  of 
Stat.  in  Hlth.  Sci. 

Chair/Organizer:  Rosemary  A.  Roberts,  Bowdoin  College 
(10:35)  FOUR-YEAR  COLLEGES  AS  A  SOURCE  OF  GOOD  GRADUATE 
STUDENTS.  Thomas  L.  Moore,  Grinnell  College;  Dean  Isaacson,  Iowa 
State  U 

(11:00) RECRUITING A STATISTICIAN AT A FOUR-YEAR COLLEGE.
Gudmund  Iversen,  Swarthmore  College;  Philip  J.  Everson,  Swarthmore 
College 

(11:25)  PREPARING  GRADUATE  STUDENTS  TO  TEACH  STATISTICS 

AT  A  FOUR-YEAR  COLLEGE.  William  I.  Notz,  Ohio  State  U;  Ann  R. 

Cannon,  Cornell  College 

(11:50)  Disc:  Lynne  Billard,  U  of  Georgia 

(12:10)  FLOOR  DISCUSSION 


236  H-Huntii 

&LOST  IN  SPACE:  ASSESSING  MULTIVARIATE  MISSINc 
DATA — Invited  Papers 

Sec.  on  Stat.  Graph.,  ENAR,  WNAR,  Bio.  Sec.,  Sec  on  Hlth.  Policy 
Organizer:  Dianne  H.  Cook,  Iowa  State  U 
Chair: Hal S. Stern, Iowa State U

(10:35)  SENSITIVITY  OF  ANALYSES  WITH  MULTIVARIATE  : 
DATA  IN  STUDIES  OF  THE  ELDERLY.  Robert  J.  Glynn,  Brigh; 
Women’s Hospital

(11:05)  MISSING  DATA  IN  INTERACTIVE  HIGH-DIMENSIO 
DATA  VISUALIZATION.  Deborah  F  Swayne,  Bellcore;  Andreas 
Labs,  Lucent  Technologies 

(11:35) CAN WE SEE WHAT ISN'T THERE? EXPLORING AN
KEEPING  TRACK  OF  MISSINGS.  Antony  Unwin,  Heike  Hofir 
Augsberg 

(12:15)  FLOOR  DISCUSSION 


O  =  Theme  session 


ICSA  1997  Applied  Statistics  Symposium 

May  30  -  June  1,  1997 

Rutgers  University,  New  Jersey,  USA 

Title:  Asymptotic  Properties  Of  Self-Consistent  Estimators  of  A  Survival  Function 
by  Qiqing  Yu  and  George  Y.  C.  Wong. 

SUNY  at  Binghamton  and  Strang  Cancer  Prevention  Center 

ABSTRACT: The asymptotic properties of the nonparametric maximum likelihood
estimator and other estimators of a joint distribution function F of a bivariate random
vector X with right-censored data have been studied by several authors. Among others, an
important assumption made in their studies is that X lives on a rectangular region [0, a] × [0, b]
which can be observed. However, in many follow-up studies, a = b = L is the length of the
study period and X lives on a region larger than [0, L] × [0, L]. Thus it is of interest to
study whether the asymptotic results established by these authors are still valid without
that restriction. In this direction, we established the strong consistency of self-consistent
estimators of a discrete distribution function.

July 30, 1997

George  Y.C.  Wong  PhD 
Strang  Cancer  Prevention  Ctr 
428  E  72  St 
New  York  NY  10021 

RE:  A  dose-ranging  study  of  indole-3-carbinol  for  breast  cancer  prevention. 

Dear  Dr.  Wong: 

Your abstract referenced above has been accepted as a poster presentation (Program # 340) for
the 20th Annual San Antonio Breast Cancer Symposium to be held December 3-6, 1997.
Enclosed is a first draft copy of the program and instructions for a poster presentation. (Please
review the instructions carefully, particularly the times for putting up and removing the posters,
since our schedule is very tight.) The final program along with meeting registration and hotel
information will be mailed to you soon -- you must still register for the meeting, even though
your abstract has been accepted.

If  for  any  reason  your  poster  will  not  be  presented,  please  notify  Ms.  Lois  Dunnington  as  early 
as possible, by electronic mail (lois_dunnington@msmtp.idde.saci.org), by FAX (210-949-5009),
or  by  phone  (210-616-5912). 

When  you  check  in  at  the  Registration  Booth,  your  symposium  materials  will  be  given  to  you. 

We  look  forward  to  your  presentation  at  our  symposium. 

Sincerely, 

GARY  C.  CHAMNESS,  PhD 
Chairman,  Abstract  Selection  Committee 
Medical  Oncology 


Symposium  Directors: 

C.  Kent  Osborne,  M.D. 

Professor  of  Medicine  and  Chief 
Division  of  Medical  Oncology 
The  University  of  Texas 
Health  Science  Center  at  San  Antonio 


Charles  A.  Coltman,  Jr.,  M.D. 

President  and  CEO 

Cancer  Therapy  and  Research  Center 

Professor  of  Medicine 

The  University  of  Texas 

Health  Science  Center  at  San  Antonio 


Symposium  Coordinator: 

Lois  E.  Dunnington 

Cancer  Therapy  and  Research  Center 
8122 Datapoint Drive, Suite 250
San  Antonio,  Texas  78229  USA 
(210)  616-5912 
FAX  (210)  616-5981 
Internet  address: 

lois_dunnington@msmtp.idde.saci.org 


Sponsored  by: 

San  Antonio  Cancer  Institute 

Cancer  Therapy  and  Research  Center 

The  University  of  Texas 

Health  Science  Center  at  San  Antonio 

337 RETINOID-INDUCED GROWTH SUPPRESSION OF NORMAL
HUMAN EPITHELIAL CELLS DOES NOT REQUIRE ACTIVATION
OF RAR-DEPENDENT GENE TRANSCRIPTION. Yang, L-M., Ludes-Meyer,
J., Munoz-Medellin, D., Kim, H-T., Reddy, P., Ostrowski, J.,
Reczek, P., and Brown, P. Division of Oncology, Dept. of Medicine,
Univ. of Texas Health Science Center, San Antonio, TX, Bristol-Myers-Squibb,
Albany, NY.

Retinoids  inhibit  the  growth  of  breast  cancer  cells  and  are  potential 
agents  for  cancer  treatment  and  prevention.  However,  the  mechanism  by 
which  retinoids  prevent  cancer  is  not  known.  The  present  studies 
investigated the mechanism by which naturally occurring and synthetic
retinoids  inhibit  the  growth  of  normal  human  mammary  epithelial  cells 
(HMECs).  All  trans  retinoic  acid  (atRA)  and  9cisRA  both  inhibited  the 
growth  of  normal  (184  and  HMEC)  and  malignant  (MCF7  and  T47D) 
breast  cells.  We  investigated  whether  retinoids  inhibit  normal  breast 
growth  by  interfering  with  the  cell  cycle  or  inducing  apoptosis.  atRA 
treatment  caused  a  cell  cycle  block  (by  increasing  G0/G1  phase  by  20% 
and  decreasing  S  phase  by  50%)  and  did  not  induce  apoptosis.  To  explore 
the  mechanism  by  which  retinoids  suppress  cell  growth,  we  correlated  the 
growth inhibitory effects of retinoids with their ability to activate RAR-dependent
transcription and transrepress AP-1-dependent transcription in
breast cells. By measuring RAR-dependent transcription using a retinoid-responsive
reporter and AP-1-dependent transcription using an AP-1
responsive reporter, we found that atRA and 9cisRA both activated RAR-dependent
transcription in 184 normal breast cells and T47D breast cancer
cells.  atRA  and  9cisRA  also  both  inhibited  AP-1  activity  in  T47D  cells, 
while  9cisRA,  but  not  atRA,  inhibited  AP-1  activity  in  184  cells.  Retinoid 
analogs which inhibit AP-1 without activating RAR were then used to
determine  whether  inhibition  of  AP-1  without  activation  of  RAR-dependent 
transcription  was  sufficient  to  inhibit  breast  cell  growth.  The  growth  of 
T47D  and  184  was  inhibited  by  these  anti-AP-1  retinoids.  These  results 
suggest  that  RAR-dependent  transcription  is  not  required  for  retinoid- 
induced  growth  suppression  of  breast  cells,  which  instead  may  be 
mediated  by  inhibition  of  AP-1.  Such  studies  investigating  the  molecular 
mechanism by which retinoids inhibit breast cell growth may lead to the
development  of  retinoid  analogs  for  breast  cancer  prevention. 


338 A multi-institutional study on the efficacy of prophylactic mastectomy in patients
with  Lobular  Carcinoma  in  Situ  (LCIS).  Mackarem  G*,  Hughes  KS,  Beny  D, 
Litten  JB,  Roche  C,  Veto  J,  Morris  A,  Turk  P,  Fraser  H,  Schnaper  L,  Friedman 
NB,  Winer  EP,  Shafir  M,  Wanebo  HJ,  Capko  D,  Pories  S,  Khan  S,  Kroener  J, 
Hawksworth  K,  Ting  P,  Barth  R.  Lahey  Hitchcock  Breast  Center,  Burlington,  MA 
01805. 


Background: The efficacy of prophylactic mastectomy has not been adequately
tested and yet women who carry BRCA1 and BRCA2 mutations are being offered
this procedure. LCIS provides an established model of high risk for breast cancer.
Published studies report that the risk of developing breast cancer in women with
LCIS approaches 33%. Our objective is to evaluate the efficacy of prophylactic
mastectomy and to estimate lifetime risk reduction from this procedure.

Methods: Retrospective data on 493 patients with LCIS were collected from 14
institutions. Patients with the diagnosis of LCIS and no previous or synchronous
DCIS or invasive cancer were eligible. 99 patients were treated with bilateral
mastectomy (BMX), 74 patients were treated with ipsilateral mastectomy (IMX),
and 320 patients were followed after initial biopsy (OBS). Ten-year actuarial
disease free survival (DFS) was calculated and compared for all groups. Statistical
significance between DFS was determined using the Mantel-Cox test.

Results: 17 patients developed an ipsilateral (IPSI), 12 a contralateral (CONT) and
1 a bilateral cancer. Median time to recurrence was 63 months. One patient died
with distant metastasis in the OBS group at 79 months.

Surgery   # pts   FU (months)   IPSI      CONT      DFS      P
OBS       320     42            18 (6%)   12 (4%)   0.7694
IMX       74      88            0         1 (1%)    0.9487   0.00001
BMX       99      75            0         -         1.0000   0.00001


16% of invasive recurrences were node positive.

Conclusions: (1) Prophylactic mastectomy markedly reduces the risk of cancer in
patients with LCIS. (2) Prophylactic mastectomy has no impact on survival at 10
years. (3) These data can be useful when extrapolating results to patients with genetic
predisposition.


339 Dietary Dehydroepiandrosterone (DHEA) Exhibits Strong
Chemopreventive Activity But Minimal Therapeutic Activity In
The MNU Induced Rat Mammary Model System.

Lubet, R.A.1, Steele, V.E.1, Kelloff, G.J.1, Eto, I.2, and Grubbs, C.J.2
1-Chemoprevention Branch NCI, Bethesda MD; 2-Dept. of Nutrition
Sciences, Univ. of Alabama at Birmingham

Female Sprague-Dawley rats (50 days old) administered a single i.v.
dose of MNU exhibit a high incidence and multiplicity of mammary
tumors by 100 days of age. Prior studies have shown that DHEA (120,
600 and 2000 ppm in diet) is a highly effective chemopreventive agent in
this model, decreasing tumor multiplicity by 55, 90 and 98%
respectively. DHEA doses (≥600 ppm) cause striking hormonal
changes in treated rats, increasing levels of androgens and estrogens
while simultaneously interfering with normal estrous cycling in rats.
Interestingly, DHEA induced proliferation and apparently differentiation
in the breasts of treated rats. Morphologically, the changes observed in
DHEA treated rats appear similar to those that occur during pregnancy.
When rats were treated with DHEA when their first palpable tumors
arose (40-60 days post MNU), a decrease in the appearance of “new”
late arising palpable tumors was observed. However, DHEA had
minimal effects on the continued growth of palpable lesions.


340 A DOSE-RANGING STUDY OF INDOLE-3-CARBINOL FOR BREAST
CANCER PREVENTION. Wong GYC*, Bradlow L, Sepkovic D, Mehl
S, Mailman J, Osborne MP, Strang Cancer Prevention Center, New
York, NY, 10021

Sixty  women  at  increased  risk  for  breast  cancer  were 
enrolled  in  a  placebo-controlled,  double-blind  dose-ranging 
chemoprevention  study  of  indole-3-carbinol  (I3C).  Fifty-seven  of 
these  women  with  a  mean  age  of  47  years  (range  22-74) 
completed the study. Each woman took a placebo capsule or an
I3C capsule daily for a total of four weeks; none of the women
experienced any significant toxicity effects. The urinary estrogen
metabolite ratio of 2-hydroxyestrone to 16α-hydroxyestrone, as
determined  by  an  ELISA  assay,  served  as  the  surrogate  endpoint 
biomarker  (SEB).  Perturbation  in  the  levels  of  SEB  from  baseline 
was  comparable  among  women  in  the  control  (C)  group  and  the 
50,  100,  200  mg  low  dose  (LD)  group.  Similarly,  it  was 
comparable  among  women  in  the  300,  400  mg  high  dose  (HD) 
group.  Regression  analysis  showed  that  peak  relative  change  of 
SEB  for  women  in  the  HD  group  was  significantly  greater  than  that 
for  women  in  the  C  and  LD  groups  by  an  amount  that  was 
inversely  related  to  baseline  ratio;  the  difference  at  the  median 
baseline  ratio  was  0.48  with  95%  confidence  interval  (0.30, 
0.67).  No  other  factors,  such  as  age  or  menopausal  status,  were 
found  to  be  significant  in  the  regression  analysis.  The  results  in 
this  study  suggest  that  I3C  at  a  minimum  effective  dose  schedule 
of  300  mg  per  day  is  a  promising  chemopreventive  agent  for 
breast  cancer  prevention.  A  larger  study  to  validate  these  results 
and to identify an optimal effective dose schedule of I3C for long-term
breast cancer chemoprevention will be necessary. [Support:
Tiger  Foundation  and  U.S.  Army  Medical  Research  and  Materiel 
Command  under  DAMD17-94-J-4332] 

carefully designed experiment should satisfy P(X ∈ B) = 0. Thus this should not be a concern.

501-W EFFECTS OF OMEGA-3 AND
OMEGA-6 POLYUNSATURATED
FATTY ACID (PUFA) DIETS ON
INSULIN-LIKE GROWTH FACTOR
METABOLISM AND RODENT
BREAST CANCER DEVELOPMENT
William  T.  Cave,  Jr. 

University  of  Rochester  School  of 
Medicine,  Department  of  Internal 
Medicine ,  Rochester,  NY 

502-W EXPRESSION OF PEROXISOME
PROLIFERATOR-ACTIVATED
RECEPTORS IN HUMAN
AND RODENT MAMMARY TISSUES

Megan Lerner, Stan Lightfoot, Xiying
Wu, Daniel Brackett, Alan Hollingsworth,
and Jeff Gimble

Abstract

Mixed interval-censored (MIC) data consist of n pairs of observations (L_1, R_1), ..., (L_n, R_n), where
−∞ ≤ L_i ≤ R_i ≤ ∞ for all i, L_k = R_k and 0 < L_j < R_j < ∞ for at least one k and one j. The survival
time X_i is only known to lie between L_i and R_i, i = 1, 2, ..., n. Peto (1973) and Turnbull (1976) obtained,
respectively, the generalized MLE (GMLE) and the self-consistent estimator (SCE) of the distribution function
of X with MIC data. In this paper, we introduce a model for MIC data and establish strong consistency,
asymptotic normality and asymptotic efficiency of the SCE and GMLE with MIC data under this model
with mild conditions.
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distribution,  strong  consistency. 


1  Partially  supported  by  Army  Grant  DAMD17-99-1-9390. 

2  Partially  supported  by  BoRSF  Grant  RA-D-31. 


1.  Introduction 

Interval censoring refers to a situation in which $X$, the time to occurrence of an event of interest, is only
known to lie in a half-open and half-closed time interval $(L, R]$, where the pair $(L, R)$ is an extended random
vector such that $-\infty \le L < X \le R \le \infty$. Interval-censored (IC) data may occur in medical follow-up
studies when each patient had several visits and the event of interest was only known to take place either
before the first visit, between two consecutive visits, or after the last one. Thus an IC data set may consist
of strictly interval-censored (SIC) observations (i.e., $0 < L < R < \infty$), and right-censored ($R = \infty$) and/or
left-censored ($L = -\infty$) observations. Examples of IC data can be found in cancer research and AIDS studies
(see, e.g., Finkelstein and Wolfe, 1985).

Case 1 data (or current status data, see Ayer et al., 1955) is a special case of IC data when each patient
had only one visit. Observations in a case 1 data set are either left-censored or right-censored. Doubly-censored
data (see Chang and Yang, 1987) consist of case 1 data and uncensored observations. It is clear
that neither case 1 data nor IC data contain uncensored observations. Furthermore, doubly-censored data
do not contain SIC observations. A data set may be a mixture of uncensored observations and IC data which
contain SIC observations. We call such data mixed interval-censored (MIC) data.
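
The taxonomy above can be made concrete with a small sketch. The function names `classify` and `is_mic` are hypothetical and are introduced only for illustration; the encoding of an uncensored observation as a pair with equal endpoints and of one-sided censoring with infinite endpoints is an assumption taken from the abstract's description of the data.

```python
import math

def classify(l, r):
    """Label one observed pair (l, r) by its censoring type (illustrative only)."""
    if l > r:
        raise ValueError("need l <= r")
    if l == r:
        return "uncensored"                  # exact event time, L = R
    if r == math.inf:
        return "right-censored"              # event after the last visit
    if l == -math.inf:
        return "left-censored"               # event before the first visit
    return "strictly interval-censored"      # finite endpoints with l < r

def is_mic(pairs):
    """True if the sample mixes exact and strictly interval-censored observations."""
    kinds = {classify(l, r) for l, r in pairs}
    return "uncensored" in kinds and "strictly interval-censored" in kinds

sample = [(2.0, 2.0), (1.0, 3.5), (4.0, math.inf), (-math.inf, 0.5)]
labels = [classify(l, r) for l, r in sample]
```

Under this encoding, a case 1 sample contains only the left-censored and right-censored labels, and a doubly-censored sample adds the uncensored label without ever producing a strictly interval-censored one.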

MIC  data  arise  in  clinical  follow-up  studies.  In  a  cancer  follow-up  study,  a  patient  whose  tumor  marker 
value  (for  instance,  CA  125  in  ovarian  cancer)  is  consistently  on  the  high  (or  low)  end  of  the  normal  range 
in  repeated  testing  is  usually  monitored  very  closely  for  possible  relapse.  If  such  a  patient  should  relapse, 
then  time  to  clinical  relapse  can  often  be  accurately  determined,  and  an  uncensored  observation  is  obtained. 
However,  if  a  patient  is  not  under  close  surveillance,  and  would  seek  help  only  after  some  tangible  symptoms 
of  the  disease  have  appeared,  then  time  to  relapse  most  likely  has  to  be  specified  to  be  within  the  dates  of 
two  successive  clinical  visits. 

Another  situation  in  which  MIC  data  can  occur  is  in  the  usual  right-censored  survival  analysis  where 
actual  dates  of  events  are  not  recorded,  or  missing,  for  a  subset  of  the  study  population,  and  can  be  established 
only to within specified intervals. An example from the Framingham Heart Study was presented by Odell
et al. (1992). In this large-scale longitudinal heart disease study, times of occurrence of coronary heart
disease  were  recorded  for  almost  every  participant.  However,  time  of  first  occurrence  of  the  coronary  heart 
disease  subcategory  angina  pectoris  was  only  recorded  for  about  20%  of  the  participants  who  suffered  from 
angina  pectoris,  and  may  be  specified  only  as  between  two  clinical  visits,  several  years  apart,  for  the  other 
participants. 

For censored data, Peto (1973) proposed a Newton-Raphson algorithm to obtain the generalized MLE
(GMLE) of the cumulative distribution function (cdf), $F$. Turnbull (1976) obtained a self-consistent estimator
(SCE) of the cdf via an EM-algorithm. A detailed discussion of more efficient algorithms for obtaining
the GMLE is given in Wellner and Zhan (1997).
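
To illustrate the kind of EM iteration involved, here is a minimal sketch of a self-consistency update on a finite candidate support. It is a simplification and not Turnbull's algorithm as published: the support is assumed to be the set of finite observed endpoints (plus a point at infinity when right-censored observations are present) rather than Turnbull's innermost intervals, an observation with `l < r` is taken to mean the event lies in `(l, r]`, and the function name `sce_em` and its parameters are hypothetical.

```python
import math

def sce_em(pairs, n_iter=500, tol=1e-10):
    """Self-consistency (EM) iteration for the point masses of a cdf estimate.

    pairs: list of (l, r) with l <= r; l == r is an exact observation,
    otherwise the event time is assumed to lie in (l, r].
    Returns (support, masses).
    """
    finite = {v for pair in pairs for v in pair if math.isfinite(v)}
    support = sorted(finite)
    if any(r == math.inf for _, r in pairs):
        support.append(math.inf)           # carries mass that escapes to the right
    # alpha[i][j] = 1 if support point j is compatible with observation i
    alpha = [[1.0 if (s == l == r) or (l < s <= r) else 0.0 for s in support]
             for l, r in pairs]
    n, m = len(pairs), len(support)
    p = [1.0 / m] * m                      # uniform starting masses
    for _ in range(n_iter):
        new = [0.0] * m
        for row in alpha:
            denom = sum(a * pj for a, pj in zip(row, p))
            for j in range(m):
                # each observation spreads weight 1/n over its compatible points
                new[j] += row[j] * p[j] / denom / n
        converged = max(abs(a - b) for a, b in zip(new, p)) < tol
        p = new
        if converged:
            break
    return support, p

support, masses = sce_em([(1.0, 1.0), (0.0, 2.0), (2.0, math.inf)])
cdf = [sum(masses[: j + 1]) for j in range(len(support))]
```

A fixed point of this update is self-consistent in the sense that each mass equals the average conditional probability assigned to its support point; whether a given fixed point is also the GMLE is a separate question.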

For IC data, Groeneboom and Wellner (1992) formulated the case 2 model; Wellner (1995) formulated
a case $k$ model, where $k > 1$; Schick and Yu (1999) modified Wellner's case $k$ model by further assuming
that $k$, the number of visits by a patient in a follow-up study, is a random integer and the observation $(L, R)$
is a mixture of various case $k$ models.

Various asymptotic distribution results of the GMLE have been obtained for censored data. For the case
1 model the GMLE is asymptotically normally distributed (a.n.) and the convergence rate is $n^{1/2}$ if the
underlying censoring distribution is discrete (Yu et al., 1998b), but the GMLE is not a.n. and the convergence
rate is $n^{1/3}$ if the cdfs have positive derivatives (Groeneboom and Wellner, 1992). For the case 2 model the
GMLE is a.n. with rate $n^{1/2}$ if the censoring vector takes on finitely many values (Yu et al., 1998c), and
Groeneboom and Wellner (1992) conjecture that under certain smoothness conditions the GMLE has a
pointwise convergence rate of $(n \ln n)^{1/3}$. For more recent developments on the latter conjecture, we refer to
Groeneboom (1996) and Van De Geer (1996).

For MIC data, several models have been proposed, and the asymptotic properties of the GMLE have
been investigated under the assumptions that either the censoring vector takes on finitely many values
(see Petroni and Wolfe, 1994, and Yu et al., 1998a), or the censoring and survival distributions are strictly
increasing and continuous, and they have "positive separation" (see Huang (1999)).

In this paper, we shall use the model in Yu et al. (1998a) to establish asymptotic properties of the
GMLE based on MIC data under the assumption that all underlying distributions are arbitrary with some
mild conditions. Since a GMLE is also an SCE (but an SCE may not be a GMLE; see Yu et al., 1998a),
and our proofs basically use the properties of SCEs, we shall focus on the asymptotic properties of SCEs for
MIC data. The main results are given in Section 2. The consistency result is proved in Section 3 and the
asymptotic normality result is proved in Section 4. Some detailed proofs of lemmas in Sections 3 and 4 are
relegated to Appendices A and B.

2.  Main  Results 

We introduce a mixture interval censorship model, a mixture of an interval censorship model and a
right censorship model, to characterize MIC data. Assume that the observed pair $(L, R)$ is generated by a
two-stage experiment. Let $(T, U, V)$ be a random censoring vector and $K$ a random integer taking values 0
and 2. Assume that $X$ and $(K, T, U, V)$ are independent. In the first stage, a value of $K$ is selected; then
$(L, R)$ corresponds to the observation from a right censorship model if $K = 0$ and from a case 2 model if
$K = 2$, i.e.,

$$(L, R) = \begin{cases} (X, X)\,1_{(X \le T)} + (T, \infty)\,1_{(X > T)} & \text{if } K = 0, \\ (-\infty, U)\,1_{(X \le U)} + (U, V)\,1_{(U < X \le V)} + (V, \infty)\,1_{(X > V)} & \text{if } K = 2, \end{cases} \qquad (2.1)$$

where $1_A$ is the indicator function of the set $A$. It is known that in order to estimate $F$, we only need
to observe $(L, R)$ (see Peto (1973)). Thus, in our model, $X$, $K$, $U$, $V$ and $T$ may not be observed. Let
$\pi_k = P(K = k) > 0$, $k = 0, 2$, and $\pi_0 + \pi_2 = 1$. Denote $(L_1, R_1), \ldots, (L_n, R_n)$ a random sample from $(L, R)$.
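
The two-stage experiment described above can be simulated directly. The sketch below is only an illustration: the exponential survival distribution, the uniform censoring distributions, the mixing probability `pi0`, and the function name `draw_observation` are all assumptions chosen for convenience, and they are not meant to satisfy the conditions imposed later in the paper.

```python
import math
import random

def draw_observation(rng, pi0=0.5):
    """Draw one (L, R) pair from a toy instance of the two-stage model."""
    x = rng.expovariate(1.0)                 # survival time X (assumed Exp(1))
    if rng.random() < pi0:                   # stage 1 selects K = 0: right censorship
        t = rng.uniform(0.5, 3.0)            # censoring time T (assumed uniform)
        return (x, x) if x <= t else (t, math.inf)
    # stage 1 selects K = 2: case 2 interval censorship with U < V
    u = rng.uniform(0.2, 1.5)
    v = u + rng.uniform(0.2, 1.5)
    if x <= u:
        return (-math.inf, u)                # left-censored
    if x <= v:
        return (u, v)                        # strictly interval-censored
    return (v, math.inf)                     # right-censored

rng = random.Random(0)
sample = [draw_observation(rng) for _ in range(10)]
```

Because X is drawn separately from the censoring variables, the independence assumption of the model holds by construction; only the pair (L, R) is returned, mirroring the remark that X, K, U, V and T need not be observed.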

Suppose that $X$, $(L, R)$, $(U, V)$, $T$, $U$ and $V$ have cdfs $F$, $Q$, $G$, $G_T$, $G_U$ and $G_V$, respectively. Define
$\tau_0 = \sup\{x : F(x) = 0\}$, $\tau_V = \sup\{x : G_V(x) < 1\}$, $\tau_T = \sup\{x : G_T(x) < 1\}$ and
$\tau = \inf\{x : F(x) = 1 \text{ or } G_T(x) = 1\}$. Let $\Theta = \{h : h$ is a nondecreasing function from $[-\infty, \infty]$ to $[0, 1]$ such that $h(-\infty) = 0$
and $h(\infty) = 1\}$. Each solution $H_n$ of the equation

$$H_n(x) = \int_{l < x < r} \frac{H_n(x) - H_n(l)}{H_n(r) - H_n(l)}\, dQ_n(l, r) + \int_{r \le x} dQ_n(l, r), \quad H_n \in \Theta \qquad (2.2)$$

is an SCE of $F$ (Li, Watkins and Yu, 1997), where $Q_n$ is the empirical version of $Q$.
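
Because $Q_n$ places mass $1/n$ on each observed pair, the integrals in (2.2) reduce to sample averages. The display below is a worked restatement, assuming that the two regions of integration in (2.2) are $l < x < r$ and $r \le x$; it is offered as a reading aid and is not an additional result of the paper.

```latex
H_n(x) \;=\; \frac{1}{n} \sum_{i=1}^{n}
  \left[ \mathbf{1}(R_i \le x)
       + \mathbf{1}(L_i < x < R_i)\,
         \frac{H_n(x) - H_n(L_i)}{H_n(R_i) - H_n(L_i)} \right]
```

Each observation contributes its full weight once its right endpoint has been passed, and otherwise the fraction of its interval mass that the current estimate places at or below $x$.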

Theorem 2.1. Let $H_n$ be a solution of (2.2). Suppose that

(AS1) (a) $\tau_V \le \tau_T$, and (b) if $F(\tau_T-) < 1$ then $P\{T \text{ or } V = \tau_T\} > 0$.

Then $\lim_{n\to\infty} \sup_{x} |H_n(x) - F(x)| = 0$ a.s. if $F(\tau) = 1$, and $\lim_{n\to\infty} \sup_{x \le \tau} |H_n(x) - F(x)| = 0$ a.s.

Remark 1. A counterexample similar to that in Schick and Yu (1999) can be constructed to show that the
GMLE is not consistent if AS1.b is deleted from our Theorem 2.1.

In clinical follow-ups, a study typically lasts for a certain period of time. Thus it is often true that
$F(\tau-) < 1$. In this regard, Gentleman and Geyer (1994, Theorem 2) claimed a vague convergence result,
and Huang (1996, Theorem 3.1) claimed a uniform strong consistency result for IC data or case 1 data. Schick
and Yu (1999) showed that both theorems as stated are false and can be corrected by adding assumption
AS1.b to their theorems.

It is well known (see Peto, 1973) that a GMLE $\hat F_n(t)$ is not uniquely determined for $t \in (L_i, R_j)$ if
$L_i < R_j$, $(L_i, R_j) \cap \{L_1, \ldots, L_n, R_1, \ldots, R_n\} = \emptyset$ and $\mu_{\hat F_n}((L_i, R_j]) > 0$. For the convenience of our proof of
normality, we restrict our attention to the following SCEs:

$$H_n \text{ is right continuous, } H_n(\infty) = 1 \text{ and } S_{H_n} \subset \{R_1, \ldots, R_n\}. \qquad (2.3)$$

Under convention (2.3) the GMLE $\hat F_n$ is uniquely determined. However there are still SCEs that satisfy (2.3)
but are not the GMLE. A point $x$ is called a support point of a function $f$ if there exists a sequence of points
$x_k \to x$ such that $|f(x_k) - f(x)| > 0$. Denote $S_f$ the set of all support points of $f$.

Theorem 2.2. Let $H_n$ satisfy (2.2) and (2.3). Suppose that AS1 holds and
(AS2) $F(\tau) > 0$ and $(S_{G_U} \cup S_{G_V}) \subset S_F$.

Then for $x < \tau$, $\sqrt{n}\,(H_n(x) - F(x))$ converges in distribution to a normal variate.

AS1 and AS2 are much weaker than the assumptions made in Petroni and Wolfe (1994), Yu et al.
(1998a) and Huang (1999).

Remark 2. In a follow-up study, each patient has $N$ visits, where $N \ge 1$ is a random integer (rather than
assuming that each patient has exactly 2 visits ($N = 2$) as in the case 2 model). The inspection times are
$Y_1 < \cdots < Y_N$. It is reasonable to assume that $X$ and $(N, \{Y_i : i \ge 1\})$ are independent. Then, on the
event $\{N = k\}$, modify $(U, V)$ in (2.1) as

$$(U, V) = (Y_1, Y_2)\,1_{(X \le Y_1)} + (Y_{k-1}, Y_k)\,1_{(X > Y_k)} + \sum_{i=2}^{k} (Y_{i-1}, Y_i)\,1_{(Y_{i-1} < X \le Y_i)}, \qquad (2.5)$$

where $Y_0 = 0$. Thus, a more realistic model for MIC data is the model of a mixture of a right censorship
model and a modified case 2 model where $(U, V)$ is specified by (2.5), instead of assuming that $X$ and $(U, V)$
are independent. This model includes our model (2.1) (in which $N = 2$ with probability one) as well as
Huang's model (in which $N$ is a fixed positive integer and $T = \infty$). It is reasonable to assume that $N$, the
number of visits, is bounded. In such a model the proofs of Theorems 2.1 and 2.2 are similar to the proofs
given in Sections 3 and 4. Thus it suffices to study model (2.1).
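
The reduction of several inspection times to a case 2 pair in Remark 2 can be sketched as follows. This is an illustration under stated assumptions: it reads (2.5) as selecting the first two inspection times when the event precedes the first visit, the last two when it follows the last visit, and the bracketing pair otherwise; it requires at least two inspection times; and the function names `case2_pair` and `observed_interval` are hypothetical.

```python
import math

def case2_pair(x, visits):
    """Reduce inspection times Y_1 < ... < Y_k (k >= 2) to the pair (U, V) around x."""
    y = sorted(visits)
    if len(y) < 2:
        raise ValueError("this sketch needs at least two inspection times")
    if x <= y[0]:
        return (y[0], y[1])              # event before the first visit
    if x > y[-1]:
        return (y[-2], y[-1])            # event after the last visit
    for lo, hi in zip(y, y[1:]):
        if lo < x <= hi:
            return (lo, hi)              # event between consecutive visits

def observed_interval(x, visits):
    """Apply the case 2 branch of the censoring model to the reduced pair (U, V)."""
    u, v = case2_pair(x, visits)
    if x <= u:
        return (-math.inf, u)
    if x <= v:
        return (u, v)
    return (v, math.inf)
```

With visits at times 1, 2 and 3, an event at 0.5 is reported as left-censored at 1, an event at 2.5 as lying in (2, 3], and an event at 5 as right-censored at 3.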


3.  Strong  Consistency 

We shall prove Theorem 2.1. To this end, we first state two preliminary results.

Theorem 3.1. Suppose that $F \in \Theta$, $F$ is right continuous and $H$ is a solution of

$$H(x) = \int_{l < x < r} \frac{H(x) - H(l)}{H(r) - H(l)}\, dQ(l, r) + \int_{r \le x} dQ(l, r), \quad H \in \Theta. \qquad (3.1)$$

Then $H(x) = F(x)$ for all $x \le \tau$ if AS1 holds; and $H(x) = F(x)$ for all $x < \tau_T$ if
(AS3) $F(\tau_T) < 1$, $\tau_V < \tau_T$ and $F = F(\tau_T)$ on $[x_0, \infty)$, where $x_0 < \tau_T$.

In (3.1), if $H(x) = H(r) = H(l)$, then we encounter $\frac{0}{0}$ in the integrand. Hereafter, define $\frac{0}{0} = 1$ and
$\frac{0}{0} \cdot 0 = 0$. If $F$ satisfies AS3, it can be viewed as the cdf of an extended random variable $X$ which equals $\infty$
with positive probability.

Proposition 3.2. Suppose that $\{f_n\}_{n \ge 1}$ is a sequence of monotone functions on an interval $[a, b)$ and $f(x)$
is a bounded monotone and right continuous function on the same interval. If $\lim_{n\to\infty} f_n(x) = f(x)$ $\forall\, x \in
[a, b)$ and $\lim_{n\to\infty} f_n(x-) = f(x-)$ $\forall\, x \in (a, b]$, then $\lim_{n\to\infty} \sup_{a \le x < b} |f_n(x) - f(x)| = 0$.

We shall present the proof of Theorem 3.1 after we prove Theorem 2.1. We omit the proof of Proposition
3.2 as it is similar to Lemma 3 of Yu and Li (1994).

Proof of Theorem 2.1. Let $\Omega$ be the event $\{\lim_{n\to\infty} Q_n(l, r) = Q(l, r)\ \forall\ l \le r\}$. For each $\omega \in \Omega$, let $H_n$
be a solution of (2.2). We shall prove the theorem in 2 steps.

Step 1 ($\lim_{n\to\infty} H_n(x) = F(x)$ and $\lim_{n\to\infty} H_n(x-) = F(x-)$ $\forall\ x \le \tau$). Since $\{H_n\}_{n \ge 1}$ is bounded
and monotone, for each subsequence of natural numbers, by Helly's selection theorem, there exists a further
subsequence, say $\{n_k\}$, such that $\lim_{n_k\to\infty} H_{n_k}(x) = H(x)$ and $\lim_{n_k\to\infty} H_{n_k}(x-) = H^*(x)$ pointwise for
some $H$ and $H^* \in \Theta$, respectively. Thus it suffices to show that $H(x) = F(x)$ and $H^*(x) = F(x-)$ for all
$x \le \tau$.

Since $Q_n$ converges uniformly to $Q$, and $H_n$ satisfies (2.2), by the bounded convergence theorem (BCT)
$H$ satisfies (3.1) and $H^*$ satisfies an equation similar to (3.1). Theorem 3.1 yields the first desired equation
$H(x) = F(x)$ on $(-\infty, \tau]$.

By AS1.a, $r > \tau \Rightarrow r = \infty$ and thus $H(r) = F(r) = 1$ as $H \in \Theta$. Then equation (3.1) and its analog
for $H^*$ yield

$$F(x-) = \int_{l < x \le r} \frac{F(x-) - F(l)}{F(r) - F(l)}\, dQ(l, r) + \int_{r < x} dQ(l, r),$$

$$H^*(x) = \int_{l < x \le r} \frac{H^*(x) - F(l)}{F(r) - F(l)}\, dQ(l, r) + \int_{r < x} dQ(l, r),$$

as $H = F$ on $(-\infty, \tau] \cup \{\infty\}$. The latter two equations yield

$$H^*(x) - F(x-) = (H^*(x) - F(x-))\,c(x), \quad \text{where } c(x) = \int_{l < x \le r} \frac{1}{F(r) - F(l)}\, dQ(l, r). \qquad (3.2)$$

By AS1, $c(x) = \int_{l < x \le r} \frac{dQ(l, r)}{F(r) - F(l)} < 1$ if $x < \tau$, or if $x = \tau$ and $F(\tau-) < 1$. It follows from equation (3.2) and $c(x) < 1$
that $H^*(\tau) = F(\tau-)$ if $F(\tau-) < 1$, and $H^*(x) = F(x-)$ $\forall\, x < \tau$. In order to show that $H^*(\tau) = F(\tau-)$ if
$F(\tau-) = 1$, let $x_k \uparrow \tau$. Note $H_n(x_k) \le H_n(\tau-) \le 1$. It yields $H(x_k) \le H^*(\tau) \le 1$. Now $\lim_{k\to\infty} H(x_k) =
\lim_{k\to\infty} F(x_k) = 1$. Thus $H^*(\tau) = 1 = F(\tau-)$.

Step 2 (conclusion). By Step 1 the sequence $\{H_n\}_{n \ge 1}$ and $F$ satisfy all the conditions for $\{f_n\}_{n \ge 1}$ and $f$
in Proposition 3.2, respectively, where $(a, b) = (-\infty, \tau)$. By Proposition 3.2, $\lim_{n\to\infty} \sup_{x \le \tau} |H_n(x) - F(x)| =
0$ $\forall\ \omega \in \Omega$. Since $P\{\Omega\} = 1$, Theorem 2.1 follows. □

The solution $H(x)$ to (3.1) is unique for $x < \tau_T$ if AS3 holds by Theorem 3.1, but Theorem 2.1 is false
if only AS3 holds, as $\int_{l < \tau_T \le r} \frac{1}{F(r) - F(l)}\, dQ(l, r) = 1$ if $P(T < \tau_T) = 1$ and $P(V < \tau_T) = 1$. The rest of the
section is devoted to proving Theorem 3.1.

The theorem is trivially true if $F(\tau) = 0$, so without loss of generality (WLOG), we can assume $F(\tau) > 0$.
The outline of the proof is as follows. We first define a functional $\psi(h)$ for $h \in \Theta$. We then show that $h = F$
uniquely maximizes $\psi(h)$ for $h \in \Theta$ (Lemma 3.3) and that each solution $H$ of (3.1) in $\Theta$ is a maximum point
of $\psi(\cdot)$. Thus $H$ must equal $F$. To this end, some notation and lemmas are needed.

Verify that there are at most countably many intervals $(y, z)$ such that (1) $y < z$ and $y < \tau$, (2)
$F(y) = F(z-)$, and (3) $y, z \in S_F$. Let $\mu(x) = [F(x) + G_U(x) + G_V(x) + G_T(x)]/4$. For $i \ge 1$, denote $D_i$ the
collection of intervals $(y, z)$ satisfying (1), (2), (3) above and $\mu(z-) - \mu(y) > 1/i$; then $D_i$ contains finitely
many intervals since $\mu(\cdot)$ is a cdf. Thus $\cup_i D_i$, the collection of all such intervals, is countable. Denote $D_i^l$
the set of left endpoints of intervals in $D_i$.

For $a = 1, 2, \ldots$, denote $B_{a,1}$ the collection of all possible $j 2^{-a} \times 100$ percentiles of the distribution $\mu$
($1 \le j \le 2^a$) which are contained in $(-\infty, \tau]$. Note that for each $j$ such that $j 2^{-a} \le \mu(\tau)$ the corresponding
percentile is given by $y = \sup\{x : \mu(x) < j 2^{-a}\}$. Let $B_a = (B_{a,1} \cup D_a^l) \cup \{\tau\}$ and denote $b_1 < \cdots < b_\beta = \tau$
to be the elements of $B_a$. Verify that

$$\mu(b_i-) - \mu(b_{i-1}) \le 2^{-a}, \quad i = 2, \ldots, \beta. \qquad (3.3)$$


Define $b_{1*} = b_1$ and $b_{i*} = \sup\{x : x \le b_i,\ F(x) = F(b_{i-1})\}$, $i = 2, \ldots, \beta$. Moreover, if $\tau < \infty$, then
denote $b_{\beta+1*} = \tau$ and $b_{\beta+1} = \infty$. For $b_i, b_j \in B_a$, define

$$(U_a, V_a) = (b_i, b_j) \quad \text{if } b_i \le U < b_{i+1},\ b_{j-1} < V \le b_j,\ i < j \le \beta. \qquad (3.4)$$

Then $P\{X \in (b_{i-1}, b_i]\} = P\{X \in [b_{i*}, b_i]\}$ as $P\{X \in (b_{i-1}, b_{i*})\} = 0$. Define an interval

$$I_a = \begin{cases}
(-\infty, b_i] & \text{if } K = 2 \text{ and } X \le b_i = U_a, \\
(b_i, b_j] & \text{if } K = 2,\ X \in (b_i, b_j] \text{ and } (U_a, V_a) = (b_i, b_j), \\
(b_i, \infty] & \text{if } X > b_i \text{ and either } K = 2 \text{ and } V_a = b_i \text{ or } K = 0 \text{ and } T_a = b_{i*}, \\
[b_{i*}, b_i] & \text{if } X \in [b_{i*}, b_i],\ K = 0 \text{ and } T_a \ge b_i.
\end{cases} \qquad (3.5)$$

Then the number of distinct realizations $I_{a,h}$ of the random interval $I_a$ is finite. Denote the joint cdf
of $(U_a, V_a)$ by $G_a$ and the cdf of $T_a$ by $G_{T,a}$. Let $L_a$ and $R_a$ be the endpoints of the interval $I_a$, $Q_a(l, r, k)$
the joint cdf of $(L_a, R_a, K)$, and $q_{a,h,k} = P(I_a = I_{a,h}, K = k)$. Abusing notation, let $Q(l, r, k)$ be the joint
cdf of $(L, R, K)$. Thus $Q(l, r)$ can be viewed as the marginal cdf of $(L, R)$.

For $H \in \Theta$, define $\mu_H$ to be the measure induced by $H$ and

$$\psi_a(H) = E\left[\ln\left(\mu_H(I_a)/\mu_F(I_a)\right)\right] \quad \Big(= \sum_{h,k} q_{a,h,k} \ln\left[\mu_H(I_{a,h})/\mu_F(I_{a,h})\right]\Big). \qquad (3.6)$$

Here we interpret $\ln 0 = -\infty$, $0 \ln 0 = 0$ and $0 \ln \infty = 0$. It is obvious by construction [see (3.3), (3.4) and (3.5)]
and by AS1.a that the measures $dG_a$, $dG_{T,a}$ and $dQ_a$ converge setwise to $dG$, $dG_T$ and $dQ$, respectively.
We call $\psi(H)$ a limit of $\{\psi_a(H) : a \ge 1\}$ if a subsequence of $\{\psi_a(H)\}$ converges to $\psi(H)$, where $\psi(H)$ may
be $-\infty$.

The proofs of the following 2 lemmas are given in Appendix A.

Lemma 3.3. Suppose that $H \in \Theta$ and either AS1 or AS3 holds. Let $\psi(H)$ be a limit of $\{\psi_a(H)\}$. Then
(1) $\psi(H) = 0$ if and only if $H(x) = F(x)$ for all $x < \tau$, and $H(\tau_T) = F(\tau_T)$ in the case $F(\tau_T-) < 1$ and
$P(T \text{ or } V = \tau_T) > 0$; (2) $\psi(H) \le 0$.

A real number $x \in [\tau_0, \tau]$ is called a left point of increase of $F \in \Theta$ if $F(x) - F(x - \epsilon) > 0$ for each
$\epsilon > 0$. Let $C_F$ be the set of all left points of increase of $F$.


Denote 
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P(lG(a,p<T,K  =  0) 
H(b)  -  if  (a) 


Lemma  3.4.  Suppose  that  if  is  a  solution  of  (3.1),  AS1  or  ASS  holds ,  and  b  G  Cf-  Then 
(E.l)  f  =  1  if  F(; r)  <  1; 

(E.2)  711  (a,  b)  <  1  /or  each  a  <  b;  (E.3)  fl<r  w^fr0[jdQ(l,  r)  +  limat6  b)  -  1  =  0. 

Proof of Theorem 3.1. Let $H$ be a solution of (3.1). We shall assume that $H(x) \ne F(x)$ for some $x \le \tau$
but AS1 holds, or $H(x) \ne F(x)$ for some $x < \tau_T$ but AS3 holds; and show that it leads to a contradiction.

Let  i)(H)  be  a  limit  of  i/>a(if).  WLOG,  assume  lim^oo  ^a(if)  =  Since  if  ^  F  for  some  to  £  T, 

'0( F )  =  0  >  'ip(H)  by  Lemma  3.3.  Therefore,  there  exists  an  integer  a\  such  that  ipa(F)  >  'ipa(H)  +  6 ,  for  all 
a  >  ai,  where  6  =  -^(if)/2  >  0.  For  each  a  >  oi,  let  pi  =  ®  =  1, /?,  and  p/3+1  =  1  -  F(r). 

It  is  seen  that  bi ,  /3,  and  pj  all  are  functions  of  a.  Then,  for  a  >  ai,  the  above  inequality  yields 


5<-MH)  +  MF) 

Um  jiW  +  ifgMF)  - 

u4-0  w 

<  lim - - -  (since  — ln(»)  and  hence  —^(0  is  convex) 


t40 


Mtf+uF  (/q.i) 


=  lim  ga>j,fcln  ~ 

Tin  u 

^  +  sfef )  -  Ml  +  «)] 

140  u 


(by  (3.6)) 

f  F(r)-F(Q  f  gq^.0  1 

Jl<r  and  fc=2,  or  r=oo  H(r)  -  H(l)  ^  ^  M)  ’ 


(3.7) 


where  j{  is  such  that  /Qiii  =  *  =  1,-,/L  Let  hi(Z,r)  =  and  h2{bi„bi)  = 

(E.l),  (E.2)  and  (E.3)  in  Lemma  3.4,  fl<r  dQ (l,  r)  <  1,  thus 


/3+1 


oo  >  lim 

a— too 


SB  £ Pi  [  - 1-j^bWmdQ(l, r)  (by  the  BCT) 

-►oo  J=1  iz<r  il  (r)  —  ti  (/) 


limq-^oo  ^2j- 1  Pi^(bj€(/,r]) 


"/<r 

=  [  hi(ltr)dQ(l,r). 

Jl<r 


dQ(l,r)  (by  Fatou’s  lemma) 


Since  hi  is  a  nonnegative  measurable  function,  (3.8)  implies  that  it  is  integrable.  Since 

fe,*, o  <  P(X  G  L bi+,bi),X  <TayX  =  0) 


(3.8) 


(3.9) 
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by  the  definition  of  Ua,  Va,  Ta  and  Ia  [see  (3.4)  and  (3.5)],  (E.2)  and  (3.9)  imply  that  |/&2(&i*j  Ml  ^  1»  and 
thus  X^f=iPi^2(^i*,5i)  converges  by  the  BCT  as  a  — >  oo.  Then 

0  <  S  <expression  (3.7) 

<  Mm  [  [  +  '^2pih2(bit,bi)  -  l\ 

ot—^oo  L  J  l<r  and  k= 2,  or  r— oo  •  -  -* 


i—1 


r  0 

—  hi(l, r)dQ(l, r)  +  )im  'y'pih,2(bu,bi)  -  1  (since  dQa  — >•  dQ  setwisely) 

J l<r  a— too  i=1 

r  0 

<  hi(Z,  r)dQ(l, r)  +  Um  y%i7ff(Z>i.,&i)  -  1  (by  (3.9)) 

J l<r  a— too 

0+1  .  1  _  0 

J  H$€h\i)  m'  r)  +  a^L^  PnH{bi*'  bi)~1  (by(3>8)) 

££“|>LL  %<W.-->  +  ™<fc..fc>  - 1]  (by  (E.1)) 

[/ «H +  “ff7gM> ~ ^  <bytheBCT> 

=0  (by  (E.3)). 

Thus  we  reach  a  contradiction  0  <  5  <  0.  This  concludes  the  proof  of  Theorem  3.1.  □ 


4.  Asymptotic  Normality 

If $F(\tau) = 0$, the GMLE $\hat F(\tau) = 0$ w.p.1. If $F(\tau) < 1$, $F(t)$ is not identifiable for $t > \tau$. Thus it suffices
to estimate $F_\tau$ defined by

$$F_\tau(t) = \begin{cases} F(t) & \text{if } t < \tau, \\ F(\tau) & \text{if } \tau \le t < \infty, \\ 1 & \text{if } t = \infty, \end{cases}$$

and assume that $F(\tau) > 0$. Here $F_\tau \in \Theta$ but may not be a cdf, and Theorem 3.1 does not require $F$ to be a cdf.

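There are two equivalent forms for equation (3.1): $H = \mathcal{B}_H(Q)$ and $H = \mathcal{R}_H(F)$, where

$$\mathcal{B}_H(Q)(x) = \int_{l < x < r} \frac{H(x) - H(l)}{H(r) - H(l)}\, dQ(l, r) + \int_{r \le x} dQ(l, r) \quad \text{(RHS of (3.1))}, \qquad (4.1)$$

$$\mathcal{R}_H(F)(x) = \int_{l < x < r} \left\{ \frac{H(x) - H(l)}{H(r) - H(l)} [F(r) - F(l)] - [F(x) - F(l)] \right\} dG^*(l, r) + F(x), \qquad (4.2)$$

$$dG^*(l, r) = \begin{cases} \pi_2\, dP(V \le l) + \pi_0\, dG_T(l) & \text{if } r = \infty, \\ \pi_2\, dP(U \le r) & \text{if } l = -\infty, \\ \pi_2\, dG(l, r) & \text{if } -\infty < l < r < \infty, \end{cases}$$

$$dQ(l, r) = [F(r) - F(l)]\, dG^*(l, r) = [F_\tau(r) - F_\tau(l)]\, dG^*(l, r) \quad \text{if } l < r. \qquad (4.3)$$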


Lemma 4.1. $\mathcal{B}_{H_n}(Q_n - Q) = \mathcal{R}_{H_n}(H_n - F_\tau)$ for each SCE $H_n$ which satisfies (2.3).

The proof of the lemma is in Appendix B.

Let $\mathcal{D}$ be the collection of all real-valued functions $h$ defined on $[-\infty, \infty]$ that are right-continuous, have
left limits at each point and satisfy that

$$\forall\ a < b \le \infty, \quad F_\tau(a-) = F_\tau(b) \Rightarrow h(a-) = h(b). \qquad (4.4)$$

Define $\mathcal{D}_0 = \{h \in \mathcal{D} : F(x) = 0 \Rightarrow h(x) = 0;\ F_\tau(x-) = 1 \Rightarrow h(x-) = 0\}$. Verify that $(\mathcal{D}, \|\cdot\|)$ and $(\mathcal{D}_0, \|\cdot\|)$
are both Banach spaces. Let $(\mathcal{D}_2, \|\cdot\|)$ be a Banach space of real-valued functions defined on $[-\infty, \infty]^2$ such
that the Banach space contains all bivariate cdfs, where $\|g\| = \sup_{x,y} |g(x, y)|$. Note that AS1-AS3 are
basically assumptions on $(F, G, G_T)$. We say $(H, G, G_T)$ satisfies AS1 etc., if $H \in \Theta$ and $H$ replaces the role
of $F$ in AS1 etc. Let $\Theta_0 = \{H \in \Theta \cap \mathcal{D} : S_H \subset S_F,\ (H, G, G_T) \text{ satisfies AS1 or AS3}\}$. For each $H \in \Theta_0$,
$\mathcal{R}_H(\cdot)$ and $\mathcal{B}_H(\cdot)$ are linear operators on $\mathcal{D}$ and $\mathcal{D}_2$, respectively.

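Theorem 4.1. Suppose that AS1, AS2 and (2.3) hold. Then $\mathcal{R}_{F_\tau}^{-1}$ exists as a bounded operator from $\mathcal{D}$ to
$\mathcal{D}$ and the SCE satisfies

$$\sqrt{n}\,(H_n - F_\tau) \xrightarrow{D} \mathcal{R}_{F_\tau}^{-1} \mathcal{B}_{F_\tau}(W) \quad \text{in } \mathcal{D}, \qquad (4.5)$$

where $W$ is the Gaussian process specified by $\sqrt{n}\,(Q_n(l, r) - Q(l, r)) \xrightarrow{D} W$.

We first state 3 more lemmas, with their proofs relegated to Appendix B.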

For an $F \in \Theta_0$, let $C_k$ be the collection of all the distinct points among the $c_{k,i}$'s, where $c_{k,i} = \inf\{x : F(x) \ge
i/2^k\}$, $i = 0, \ldots, 2^k$, $k \ge 1$. Let $F_k$ be a step function in $\Theta_0$ such that $F_k(c) = F(c)$ for each $c \in C_k$ and its
discontinuity points belong to $C_k$. Denote $\mathcal{D}_k$ ($\mathcal{D}_{k0}$) the subclass of $\mathcal{D}$ ($\mathcal{D}_0$) such that each member is a step
function with the collection of discontinuity points being a subset of $S_{F_k}$. Obviously, $\mathcal{D}_k$, $C_k$ and $F_k$ depend
on $F$.

Lemma 4.2. If $F \in \Theta_0$, then the linear operator $\mathcal{R}_{F_k}^{-1}$ exists as a map from $\mathcal{D}_k$ onto $\mathcal{D}_k$.
Lemma 4.3. Assume that AS1, AS2 and (2.3) hold. For each $\omega \in \Omega$, $H_n \in \Theta_0$.
Lemma 4.4. If $F \in \Theta_0$, then $\|\mathcal{R}_{F_k}^{-1}(\cdot)\| \le 1$ for all possible $k$.

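Proof. We give the proof of asymptotic normality in 4 steps.

Step 1 (Existence of $\mathcal{R}_F^{-1}$, $F \in \Theta_0$, as a linear operator from $\mathcal{D}$ to $\mathcal{D}$). For each $g \in \mathcal{D}$ and $k \ge 1$, let
$g_k \in \mathcal{D}_k$ be such that $g_k(x) = g(x)$ if $x \in C_k$. Then $\|g_k - g\| \to 0$, since $S_{g_k}$ and $S_g \subset S_F$ and $C = \cup_k C_k$ is
dense in $S_F$. By Lemma 4.2, $\mathcal{R}_{F_k}^{-1}$ exists, so there exists a unique $h_k \in \mathcal{D}_k$ such that $g_k = \mathcal{R}_{F_k}(h_k)$. $\forall\ K > k$
and $\forall\ h \in \mathcal{D}_k$, $\mathcal{D}_k \subset \mathcal{D}_K$, $\|F_k - F_K\| \le 1/2^k$ and $\mathcal{R}_{F_k}(h) - \mathcal{R}_{F_K}(h)$ converges to 0 as $k \to \infty$ by the BCT.
$\|\mathcal{R}_{F_k}^{-1}(\cdot)\| \le 1$ by Lemma 4.4, thus $\lim_{k\to\infty} [\mathcal{R}_{F_k}^{-1}(h) - \mathcal{R}_{F_K}^{-1}(h)] = 0$ $\forall\ h \in \mathcal{D}_k$ and $\forall\ k \ge 1$. Furthermore,

$$\|h_k - h_K\| \le \|\mathcal{R}_{F_k}^{-1}(g_k) - \mathcal{R}_{F_K}^{-1}(g_k)\| + \|\mathcal{R}_{F_K}^{-1}(g_k) - \mathcal{R}_{F_K}^{-1}(g_K)\|$$
$$\le \|\mathcal{R}_{F_k}^{-1}(g_k) - \mathcal{R}_{F_K}^{-1}(g_k)\| + \|\mathcal{R}_{F_K}^{-1}\| \cdot \|g_k - g_K\| \to 0 \quad \text{as } k \to \infty,$$

by the assumption $\|g_k - g\| \to 0$, Lemmas 4.2 and 4.4, and the BCT. That is, $\{h_k\}$ is a Cauchy sequence.
Since $\mathcal{D}$ is a Banach space, there is a function $h_0 \in \mathcal{D}$ such that $\|h_k - h_0\| \to 0$. By the BCT, $g =
\lim_{k\to\infty} \mathcal{R}_{F_k}(h_k) = \mathcal{R}_F(h_0)$. Define $h_0 = \mathcal{R}_F^{-1}(g)$.

Step 2 (Strong continuity of $\{\mathcal{R}_H^{-1} : H \in \Theta_0\}$). Let $g'_m \in \mathcal{D}$ and $H_m \in \Theta_0$ be such that $\|g'_m - g\| \to 0$
and $\|H_m - F_\tau\| \to 0$ as $m \to \infty$. Then

$$\|\mathcal{R}_{H_m}^{-1}(g'_m) - \mathcal{R}_{F_\tau}^{-1}(g)\| \le \|\mathcal{R}_{H_m}^{-1}(g'_m) - \mathcal{R}_{F_\tau}^{-1}(g'_m)\| + \|\mathcal{R}_{F_\tau}^{-1}(g'_m) - \mathcal{R}_{F_\tau}^{-1}(g)\|$$
$$\le \|\mathcal{R}_{H_m}^{-1} - \mathcal{R}_{F_\tau}^{-1}\| \cdot \|g'_m\| + \|\mathcal{R}_{F_\tau}^{-1}\| \cdot \|g'_m - g\| \to 0 \quad \text{as } m \to \infty.$$

Step 3 (Strong continuity of $\{\mathcal{B}_H : H \in \Theta_0\}$). Let $h$ be a simple function in $\mathcal{D}_2$. It follows from (4.1)
and the BCT that $\mathcal{B}_H(h) \to \mathcal{B}_{F_\tau}(h)$ in $\mathcal{D}$ as $H \to F_\tau$. Since $\|\mathcal{B}_H\| \le 4$ $\forall\ H \in \Theta_0$ and the collection of simple
functions is dense in $\mathcal{D}_2$, we have strong continuity.

Step 4 (Conclusion). By Lemma 4.3, $H_n \in \Theta_0$. Thus $\mathcal{R}_{H_n}^{-1}$ exists by Step 1. It follows that
$\sqrt{n}\,(H_n - F_\tau) = \mathcal{R}_{H_n}^{-1} \mathcal{B}_{H_n}(\sqrt{n}\,[Q_n - Q])$ by Lemma 4.1. By Theorem 2.1, $\lim_{n\to\infty} |H_n(x) - F_\tau(x)| = 0$ a.s. By Steps
2 and 3, $\{\mathcal{F}_H = \mathcal{R}_H^{-1} \mathcal{B}_H : H \in \Theta_0\}$ is strongly continuous. As a consequence of the above 4 statements, and
the Banach-Steinhaus theorem, $\sup\{\|\mathcal{F}_{H_n}(h) - \mathcal{F}_{F_\tau}(h)\| : h \in A(\epsilon)\} \to 0$ a.s. as $n \to \infty$ and then $\epsilon \to 0+$
for all compact sets $A \subset \mathcal{D}_2$, where $A(\epsilon) = \{h \in \mathcal{D}_2 : \|h - h'\| < \epsilon \text{ for some } h' \in A\}$. By the central limit
theorem, $W_n = \sqrt{n}\,[Q_n - Q] \xrightarrow{D} W$ in $\mathcal{D}_2$, and $\{W_n\}$ is uniformly tight (Pollard, 1984, p. 81). As a consequence,
$\|\sqrt{n}\,(H_n - F_\tau) - \mathcal{F}_{F_\tau}(W_n)\| = \|(\mathcal{F}_{H_n} - \mathcal{F}_{F_\tau})(W_n)\| = o_p(1)$, which implies (4.5) by the continuous mapping
theorem (Pollard, 1984, p. 70). □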

Remark 3. Our proof of the normality (not the consistency) relies on the form (2.3). It can be shown that
Theorem 4.1 is actually true without (2.3), and Theorem 2.1 (not Theorem 4.1) is true without (AS1.a).
For the sake of simplicity, we skip the details.

Theorem 2.2 is a consequence of Theorem 4.1.

Remark 4. Under assumptions AS1 and AS2, $H_n$ is also efficient. The proof is analogous to that of Theorem
3 of Gu and Zhang (1993) and is skipped here.

Appendix A

We shall prove Lemmas 3.3 and 3.4. A lemma is needed to prove Lemma 3.3.

Lemma A.1. Assume that AS1 or AS3 holds. Let $\psi(H)$ be a limit of $\{\psi_a(H)\}$, $H \in \Theta$. Then $\psi(H) = 0$
if and only if (1) $H(t) = F(t)$ and $H(t-) = F(t-)$ $\forall\ t \in S_F \cap \cup_a B_a \cap (-\infty, \tau)$, (2) $H(\tau-) = F(\tau-)$ if
$F(\tau-) < 1$ and (3) $H(\tau) = F(\tau)$ if $F(\tau-) < 1$ and AS1 holds. Moreover, $\psi(H) \le 0$.

Proof. ($\Rightarrow$) Verify that $\psi_a(F) = 0$ for all $a$ by AS1.a, and thus $\lim_{a\to\infty} \psi_a(F) = 0$. Then conditions (1) -
(3) above imply that $\psi_a(H) = \psi_a(F) = 0$ for all $a \ge 1$. Thus $\psi(H) = 0$.

($\Leftarrow$) We first show that $\psi(H) = 0$ implies condition (1). It suffices to show that $\psi(H) < 0$ if for some
$t_0 \in S_F \cap \cup_a B_a \cap (-\infty, \tau)$ either (1.a) $H(t_0) \ne F(t_0)$ or (1.b) $H(t_0-) \ne F(t_0-)$. Condition (1.a) implies
that for each sufficiently large $a$, there is a point $b_h \in S_F \cap B_a$ such that $b_h = t_0$. Verify that

=*2  j  J  fa,2{z,y)dGa(z,y)  +  w0  j  fa,o(t)dGT,a(t),  where 

=F^m + ™  + 1‘  - 

+  £  + 11  -  ^,lkT^)  ’ 


(A.  1) 


and  to  and  bj  £  Ba.  Note  t0  is  fixed  but  the  index  h  of  bh  =  t0  depends  on  a.  Define 

9(m) = ( + [i - nwMEffg  if*,  st, 

1 0  otherwise. 


(A.2) 


Then  0  =  ln|F(*0)$$  +  [1  -  i^o)]iE#}g]  >  ^o)ln#g}  +  [1  -  F(t0)} =  ff(0,t),  for  t  >  t0,  as 
-ln(-)  is  strictly  convex  and  F(t0)  /  H(t0).  Moreover,  P{T  or  V  >  to}  >  0  as  no  >  0  and  to  €  (-oo,  r).  It 
foDows  from  the  above  two  statements  that 


P{0>g(/C,T)j  >  0. 


(A3) 


It  is  obvious  that  (l.a.l)  g(2,  t)  >  fa,2(u,v)  for  each  (u,v,t)  and  (l.a.2)  0  =  g(0,  t)  >  fa,o(t)  for  t  <  to.  We 
shall  show  that,  (l.a.3)  g(0,t)  >  fa,o(t),  for  t  =  bk  >  to,  where  bk  €  Ba  and  a  is  sufficiently  large.  Let 
fgdG%  =  n2  f  f  3(2,  t)dGa (u,  v)  +  n0  f  g(0,t)dGTa{t),  and  define 

f  gdGw  in  an  obvious  way.  Then  (l.a.l),  (l.a.2)  and  (l.a.3)  imply  that  f  gdG “  >  ipa(H).  Since  dG™ 
converges  to  dGw  setwisely  by  observing  that  dGa  (1 dGr a)  converges  to  dG  (dGr)  setwisely  and  g(k,t )  is 
a  binary  function  in  (u,  v,  t,  k),  the  desired  result  follows  from  (A.3)  and  0  >  /  gdGw  —  lim  f  gdG “  > 

a— >-oo 

lim  ipa(H)  >  ip(H). 

a— >oo 

We now establish (1.a.3). Let $t_0 = b_h < b_j = t$ for some integer $a_0$. It is easy to see by our construction that $B_{a_1} \subset B_{a_2}$ if $a_1 < a_2$ and hence $t_0, t \in B_a$ for all $a \ge a_0$. For each $z = b_i \in B_a$ such that $z < t_0$, verify


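(i)
\[ g(0,t) = F(t_0)\ln\left\{\frac{H(z)}{F(z)}\,\frac{F(z)}{F(t_0)} + \frac{H(t_0) - H(z)}{F(t_0) - F(z)}\,\frac{F(t_0) - F(z)}{F(t_0)}\right\} + [1 - F(t_0)]\ln\left\{\frac{H(t) - H(t_0)}{F(t) - F(t_0)}\,\frac{F(t) - F(t_0)}{1 - F(t_0)} + \frac{1 - H(t)}{1 - F(t)}\,\frac{1 - F(t)}{1 - F(t_0)}\right\}; \]

(ii)
\[ \ln\frac{H(b) - H(a)}{F(b) - F(a)} \ge \frac{F(x) - F(a)}{F(b) - F(a)}\ln\frac{H(x) - H(a)}{F(x) - F(a)} + \frac{F(b) - F(x)}{F(b) - F(a)}\ln\frac{H(b) - H(x)}{F(b) - F(x)} \quad \text{for all } x \in (a, b). \]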

In  view  of  (i)  and  (ii),  (l.a.3)  follows  by  an  induction  argument. 
Now consider condition (1.b). If $t_0$ is a point satisfying condition (1.b), then either (1.b.1) $t_0 \in S_F \cap (\bigcup_a B_a) \cap (\bigcup_a B_a^*)$, where $B_a^* = \{x : x = b_j^* > b_{j-1},\ b_j, b_{j-1} \in B_a\}$, or (1.b.2) $t_0 \in S_F \cap (\bigcup_a B_a) \cap (\bigcup_a B_a^*)^c$, where $A^c$ is the complement of the set $A$.

First assume (1.b.1). For each sufficiently large $a$, there exists a $b_h^* = t_0 \in B_a^*$. Thus replacing $t_0$ by $t_0-$ in the proof for situation (1.a) yields $\psi(H) < 0$.

On the other hand, in view of (3.3), (1.b.2) implies that $F(t_0-) > F(t)$ for each $t < t_0$ and hence there exists a sequence of points $x_i \in S_F \cap \bigcup_a (B_a \cup B_a^*)$ such that $x_i \uparrow t_0$ with either $H(x_i) \ne F(x_i)$ (if $x_i = b_j^* = b_{j-1}$) or $H(x_i-) \ne F(x_i-)$ (if $x_i = b_j^* > b_{j-1}$). In either case, it reduces to situation (1.a) or (1.b.1). Thus, we have $\psi(H) < 0$. This concludes the proof for condition (1.b).

The proofs for conditions (3) and (2) are similar to those for conditions (1.a) and (1.b), respectively, except that in the proof for condition (3) the statement $P\{t_0 < T\} > 0$ in the above proof is replaced by $P\{T \text{ or } V = \tau_t\} > 0$ (as AS1 holds). We omit the details.

Verify that we have actually shown that either $\psi(H) = 0$ or $\psi(H) < 0$. Thus $\psi(H) \le 0$. □

Proof of Lemma 3.3. Statement (2) follows from the last statement in Lemma A.1. To prove statement (1), it suffices to show that conditions (1), (2) and (3) in Lemma A.1 imply $H(x) = F(x)$ $\forall\, x \le \tau$, i.e. the sufficient and necessary condition in Lemma 3.3.

If $x$ is a discontinuity point of $F$ and $x \le \tau$, then there exists an integer $N$ such that $F(x) - F(x-) > 2^{-a}$ for all $a \ge N$. This implies that $x$ is a certain $j2^{-N} \times 100$ percentile of $\mu_F$ and thus $x \in B_a \cap S_F$. It follows that $S_F \cap \bigcup_a B_a$ contains all discontinuity points of $F$ which belong to $(-\infty, \tau]$. Thus conditions (1), (2) and (3) of Lemma A.1 imply $H(x) = F(x)$.

Suppose now $x$ is a continuity point of $F$. Let $u_x = \inf\{y : F(y) = F(x)\}$ and $v_x = \sup\{y : F(y) = F(x)\}$. If both $u_x$ and $v_x$ belong to $S_F \cap \bigcup_a B_a$, we are done, as $F(x) = F(u_x) = H(u_x) \le H(x) \le H(v_x-) = F(v_x-) = F(x)$ by conditions (1), (2) and (3) in Lemma A.1.

If neither $u_x$ nor $v_x$ belongs to $S_F \cap \bigcup_a B_a$, then from the above discussion both $u_x$ and $v_x$ are continuous support points of $F$ satisfying $F(u_x) = F(v_x) = F(x)$, and there exist two sequences of support points of $F$, say $\{x_i\}_{i \ge 1}$ and $\{y_j\}_{j \ge 1}$, which are contained in $S_F \cap \bigcup_a B_a$ such that $x_i \uparrow u_x$ and $y_j \downarrow v_x$. Consequently, $F(x_i) = H(x_i) \le H(x) \le H(y_j) = F(y_j)$ by conditions (1), (2) and (3) in Lemma A.1. This yields $H(x) = F(x)$ as $F(x_i) \to F(u_x)$ and $F(y_j) \to F(v_x)$.

For simplicity, we skip the proof for the case that only $u_x$ or $v_x$ belongs to $S_F \cap (\bigcup_a B_a)$. This concludes the proof of the lemma. □

A  lemma  is  needed  for  proving  Lemma  3.4. 

Lemma A.2. Suppose that $H$ is a solution of (3.1) and $A$ is an interval $(a, b] \subset (-\infty, \tau]$. Then $\mu_F(A) > 0 \Rightarrow \mu_H(A) > 0$.

Proof. Equation (3.1) is equivalent to

\[ \mu_H((a,b]) = \int \frac{\mu_H((a,b] \cap (l,r])}{H(r) - H(l)}\,dQ(l,r) + P(X \in (a,b],\ X \le T,\ \mathcal{K} = 0). \tag{A.4} \]

If $H$ is a solution to (3.1), then for each interval $A = (a, b] \subset (-\infty, \tau]$ such that $\mu_F(A) > 0$, we have $\mu_H(A) \ge P(X \in A,\ X \le T,\ \mathcal{K} = 0) > 0$ by the assumption $\pi_0 > 0$ and $b \le \tau$. This concludes the proof of the lemma. □

Proof of Lemma 3.4. Assume that $H$ is a solution of (3.1) and $b \in C_F$. By Lemma A.2, $\mu_H((a, b]) > 0$ $\forall\, a < b$. Dividing both sides of equation (A.4) by $\mu_H((a,b])$ yields

\[ 1 = \int \frac{\mu_H((a,b] \cap (l,r])}{\mu_H((a,b])\,[H(r) - H(l)]}\,dQ(l,r) + \frac{P(X \in (a,b],\ X \le T,\ \mathcal{K} = 0)}{\mu_H((a,b])}. \tag{A.5} \]

For each $a < b$, (A.5) yields (E.2) as the two summands in (A.5) are nonnegative.

Denote

\[ d_H(a,b,l,r) = \begin{cases} 0 & \text{if } \mu_H((a,b] \cap (l,r]) = 0, \\ \dfrac{\mu_H((a,b] \cap (l,r])}{[H(b) - H(a)][H(r) - H(l)]} & \text{otherwise.} \end{cases} \]

For each pair $(l, r)$ such that $l < b \le r$, we have $H(r) - H(l) > 0$ by Lemma A.2. Moreover,

\[ d_H(a,b,l,r) \uparrow \frac{1(b \in (l,r])}{H(r) - H(l)} \]

as $a \uparrow b$ if $b \in (l, r]$, and $d_H(a,b,l,r) \downarrow 0$ as $a \uparrow b$ if $b > r$.
Thus by the monotone convergence theorem, as $a \uparrow b$ we have

\[ \int d_H(a,b,l,r)\,dQ = \int_{b \in (l,r]} d_H(a,b,l,r)\,dQ + \int_{b \notin (l,r]} d_H(a,b,l,r)\,dQ \to \int \frac{1(b \in (l,r])}{H(r) - H(l)}\,dQ. \]

The desired equation (E.3) follows from (A.5), (E.2) and the above equation.

Assume  now  0  <  F(r)  <  1  and  AS1  holds.  By  ASl.a,  l<rt<r=>r  =  oo,  thus 

Hh((t,oo})=  [  y>"S^,r)  (by  AS1  and  (A. 4))  (A.6) 

Jl<T<r  M\?)  ”  -“W 

>  [  dQ(l ,  r)  =  (1  —  P(r))P(L  =  r)  >  0  (by  ASl.b). 

*/  l=T<r 

Dividing  both  sides  of  equation  (A.6)  by  fin  (fa,  oo])  yields  (E.l)  under  AS1. 

On  the  other  hand,  assume  that  0  <  F(r)  <  1  and  AS3  holds.  Note  that  even  if  we  encounter 

(r)^°H(0,r^  Mxo<Kr)  =  1  by  convention.  By  AS3,  P(z  <  T  <  rt)  >  0  for  each  x  <  rt.  (A.4)  yields 


/»»((*. 


m((x,  oo]  n  (z,r]) 


<ZQ(Z,r) 


’  J/  7i<r  H(r)-H(l) 

>(1  -  F(rt))P(T  €  (x,  T(])  >0,  [x0,Tt). 

Then  dividing  both  sides  of  (A. 7)  by  /i# ((z,  oo])  and  taking  limits  yield 


=  lim  f 

*Tn  Jic 


Mg((x,OQ]n(Z,r])  /■  1(t<rt<r)  rfoa  x 

/*»((*> oo))(JT(r)  -  H(l)) dQ^1,  '  ~  J  H(r)-H(l)dQV'r) 


(as  rv  <  rt  and  thus  l<rt<r^r  —  oo),  which  is  (E.l).  □ 

Appendix  B 

In  this  appendix,  we  prove  lemmas  in  Section  4. 

Proof  of  Lemma  4.1.  Theorem  3.1,  (4.1),  (4.2)  and  (2.3)  yield  BHn(Qn){%)  =  Hn(x)  and  ']Zft(Ft)(x)  = 
Fr(x)  V  x.  Furthermore, 


f  Hn(x)-Hn(l) 
liCxcr  Hn(r)  —  Hn(l) 


{[Hn(r)  -  FT(r)}  -  [fln(0  -  Fr(l)]}dG*(l,r) 


=  f  [Hn(x)  -  Hn(l)]dG*(l,r)  -  [  ^—Ml[FT{r)  -  FT(l)}dG*(l,r) 

Jl<x<r  Jl<x<r 

=  f  [Hn(x)  -  Hn(l)]dG*(l,r)  -  [  ~  *?W)  (by  (4.3)) 

Jl<x<r  Jl<x<r 

=  f  [Hn(x)  -  H„(l)}dG*(l,r)  -  BHn{Q)(z)  +  P{iZ  <  x}  (by  (4.1)) 

=  [  [H„(x)  —  Hn(l)]dG*(l,r) 

J  l<x<r 

-  BHn (Q)(x)  +  [BHn  (Qn) (x)  -  H„(x)]  (since  BHn  (Qn)  =  Hn) 

+  [Fr(x)  -  f  FT{x)  -  FT(l)dG*(l,r)\  (=  P{R  <  x}  as  FT  =  HFt(Ft)) 

J  l<x<r 

=BHn(Qn  -  Q)(x)  -  [Hn(x)  -  Fr(x)}  +  f  {[Hn(x)  -  Fr(x)j  -  {. Hn(l )  -  Fr(l)]}dG*. 

J  l<x<r 

Translating  certain  terms  in  the  first  and  last  expressions  of  the  above  equations  yields 


[FT(r)-FT(l)]dG*(l,r) 


BhAQu-Q)(x) 
=  [<x<r  g"(r)-g"((!)){[H"(r)  “  Fr(r)]  "  [Hn^l)  ~  Fr«)}}dG*(l,r)

+  [Hn(x)  -  FT(x)}  -  f  {[Hn(x)  -  Ft(*)]  -  [Hn(l)  -  Fr(l)}}dG*(l, r ) 

J  l<x<r 

=  /  ( ~  ~  [FAr)  -  Fr(l)}} 

JI<X< r  \Hn{r)-Hn(! ) 

-  {{Hn(x)  -  Hn(l)}  -  [Ft(x)  -  Fr(l)]})dG*(l,r)  +  [Hn(x)  -  Fr(x)} 

=KHn(Hn-FT)(x)  (by  (4.2)).  o 

Lemma  B.l.  If  F  G  0O  and  1 Zp(h)  =  0,  where  h  G  V,  then  h  G  Vq. 

Proof.  For  each  h  G  V,  by  (4.2), 

Kp{h)(x)=  f  {ffif -  W  [ft(r)  -  h(l)}  -  (h(x)  -  h(l)]}dG*  (l,r)  +  h(x).  (B.l) 

Ji<x<r  F(r)  —  F(l) 

If  F(x)  =  0,  then  h  =  h{x)  on  (— oo,x]  by  (4.4).  Thus 

0  =  Kp(h)(x)  =  -  f^x<r[h(x)  -  h(l)]dG*{l,r)  +  h(x)  =  h(x ). 

Moreover,  if  Fr(x-)  =  1,  then  h  =  h(x— )  on  [x,  oo]  by  (4.4),  and 
0  =  Kjr(h){x-)  =  /i<:c<r[Mr)  “  h(x-)]dG*(l,r)  +  h{x~)  =  h(x~).  Thus  heVQ.n 

Proof  of  Lemma  4,2.  Note  F  G  0O.  Since  TZpk  is  a  linear  operator  on  the  finite  dimensional  linear  space 
£>&,  it  suffices  to  show  (1)  7 Zpk  is  1-1  and  (2)  7 ^Fk(pk)  C  Vk. 

Step  (1).  Suppose  TlFk(h)  =  0,  where  h  G  D*.  We  shall  show  that  h  =  0.  Denote  a  =  J2ceck  IMC)  ~ 
ft(c— ) |  and  m  =  min{mc  :  mc  =  Ffc(c)  —  Ffc(c— )  >  0,  c  G  Cfc}.  Note  that  ra  >  0  and  a  is  finite  as  G*  contains 
finitely  many  points.  Choose  7  >  0  such  that  7a  <  ra.  Let  H  —  Fk  +  7/1.  Since  7^Ffc(^)  =  0  and  F  G  0O, 
/i  G  Pfco  by  Lemma  B.l.  As  consequences,  (1)  H(tq-)  =  Fk(r0 — )  +  0  =  0  and  il(oo)  =  Ffc(oo)  +  0  =  1;  (2) 
i?(c)  >  JT(c-)  for  ah  c  G  Ck  [as  H(c)  -  H(c-)  >  ra  -  7(h(c)  -  ft(c-))  >  ra  -  7a  >  0];  (3)  JT(x)  €  P*. 

It  follows  from  statements  (1),  (2)  and  (3)  that  H  =  Fk  +  7/1  G  0O  D  £>*;.  Then  7^Ffc(Lr)(x)  = 
^Ffc(Ffc)(x)  +  1ZFk('yh)(x)  =  Ffc(x)  +  0  for  each  x.  That  is  Fk  =  7lFk{H).  Note  that  (H,  G,  Gt)  satis¬ 
fies  AS1  or  AS3  as  H  €  0O.  Thus  F*.  =  if  =  Fk  +  7/1  by  Theorem  3.1,  which  implies  /i  =  0  as  7  >  0.  As  a 
consequence,  7^(0  is  1-1. 

Step  (2).  It  suffices  to  show  that  A  —  1 lFk(h)(b)  -  1ZFk(h)(a)  =  0  if  h  G  Vk  and  fj,Fk({a,b])  =  0.  Define 
^h((a,  b ])  =  h(b)  —  h(a).  By  definition  of  D*.,  M/i((a)  &])  =  0-  Then 

A  =  /</Ft^({4])’6i)^((Z’r])  ~  n  M])MG*(/,r)  +  /ifc((o,6])  =  0.  □ 

Proof  of  Lemma  4.3.  We  fix  cj  G  f2,  as  Hn  is  random.  We  shall  verify  that  satisfies  the  properties  of 
0O.  First,  by  AS2  rv  <  rt. 

If  a  <  b  and  F(a-)  =  F(6),  then  [0,6]  n  Sf  =  0.  It  follows  that  {Ei,...,^}  fi  [a, b]  =  0  by  AS2  and 
thus  Hn  satisfies  (4.4)  and  Hn  G  V  by  Convention  (2.3). 

If  Hn(rt-)  =  1  then  (fZn,G,Gr)  trivially  satisfies  AS1  and  thus  i?n  G  ©0.  Moreover,  if  F(rt — )  <  1, 
then  P(T  or  F  =  7)  >  0  and  thus  (#n,G,  Gt)  also  satisfie  AS1.  It  follows  that  Hn  G  0O*  Hence,  WLOG, 
we  can  assume  that  F(rt—)  —  1.  and  Hn(rt-)  <  1.  Then  either  P(R  =  rt)  >  0  or  P(R  =  rt)  =  0  If 
P(R  =  Tt)  >  0,  then  P(V  =  7)  >  0  as  F(r*-)  =  1.  /.e.,  (i?n,G,  Gt)  satisfies  AS1  and  Hn  G  0O.  If 
P(R  =  rt)  =  0  then  with  probability  one  Ri  ^  rt.  WLOG,  we  can  assume  that  Ri  ±  rt.  Let  xQ  be  the 
largest  Ri  that  is  smaller  than  rt.  Then  (2.3)  implies  that  M//n([xo>  r*D  =  0-  Moreover,  rv  <  rt  by  AS1. 
Hence,  (i/n,G,  Gr)  satisfies  AS3.  It  follows  that  Hn  G  0O.  a 

Proof of Lemma 4.4. Let $o_1, \ldots, o_m$ be all the discontinuity points of $F_k$. Then $\mathcal{D}_k$ is an $m$-dimensional linear space. Define $h_i(x) = 1(x \ge o_i)$. We shall show that

AFk  —  F,Fk{hi)(oj)  -  7 lFk(hi)(oj—)  >  0  for  each  j  and  for  each  h*.  (B.2) 
Verify  that  7 Zpk(hi)  €  P*  (by  Lemma  4.2),  7^Ffc(^i)(0^)  =  0  and  HFk(hi)(oo)  =  Then

(/&*)»  *  =  1>  to,  are  a  base  of  P*  and  \\TlFk{hi)\  \  =  1,  (B.3) 

by  Lemma  4.2,  as  P*.  is  an  m-dimensional  linear  space,  and 


hi,  i  =  1,  ...,  m,  are  a  base  of  P*  and  \\hi\\  =  1.  (B.4) 

(B.3)  and  (B.4)  imply  that  ||7?,^fc1||  —  1. 

The  proof  of  the  lemma  will  be  completed  after  we  prove  (B.2).  Letting  x  =  Oj,  h  =  hi,  F  =  Fk,  (B.l) 
yields 


nF(h)(x)=J[ 


li^x<r[F(r)-F(l) 


h{r)  +  + 


where  0(x)  =  7To(l  —  GT(x))h(x).  Moreover,  {l  <  x —  <r}  =  {l  <  x  <r}  and 

AF=  f 

Jl< 


F(x)-F(l)  F(r)-F(x) 

Mr)  +  lpr^  ri/nh(l)]aG  ( l,r ) 


l<x<rF(r)~F(l)  v  '  F(r)-F(l) 

F{x-)-F(l)L,_,  ,  F(r)-F(x~) 
«*</  f(r)-F(0 


■X 

-/ 

Jl<X—7 


l<x<r  F(r)  F(l)  J  i-.x<r 

,F(x-)-F(l)Mx)  +  F^x~^{FX~h(l)}dG-(l,  r)  +  0(x)  -  /?(*-)• 


n<x=r  F(x)-F(l) 

Replacing  F  and  h  by  Fk  and  1(2>0.),  respectively,  equation  (B.5)  yields 


AFk  >  f  h(x)dG*(l,r)  —  f  h(x)dG*{l,r)  +  0(x)  -  0(x~) 

Jl=x<r  J  l<x=r 

=n0P(T  =  x)h(x)  4-  7To(l  -  Gr(x))h(x)  -  7r0(l  -  Gr{x-))h{x-) 
=7T0(1  -  GT{x-)){h{x)  -  h(x-)) 

>0,  which  is  (B.2).  □ 


(B.5) 
Abstract

We consider efficient estimation of a distribution function $F$ of a random variable $X$ with doubly-censored data. The double censorship model assumes that $X$ and the random vector $(Z, Y)$ are independent and $Z \le Y$ with probability one, and that $X$ is uncensored if $Z < X \le Y$, right censored if $Y < X$ and left censored if $X \le Z$. Let $K(x) = P(Z < x \le Y)$ and let $B = \{x \mid K(x-) = 0,\ F(x) > 0 \text{ and } F(x-) < 1\}$. Under the assumption $P(X \in B) = 0$, we present an example in which the generalized maximum likelihood estimator (GMLE) of $F$ with doubly-censored data is not asymptotically normally distributed and is not asymptotically efficient, and we propose a modified GMLE. We conjecture that it is asymptotically normally distributed and asymptotically efficient under the assumption $P(X \in B) = 0$. We give a proof under an additional assumption.

1  Partially  supported  by  DAMD17-94-J-4332. 
1. Introduction

We consider efficient estimation of a survival function with doubly-censored data. Let $X_1, X_2, \ldots, X_n$ be i.i.d. copies of a random survival time $X$ with distribution function $F$. Let $(Z_1, Y_1), (Z_2, Y_2), \ldots, (Z_n, Y_n)$ be i.i.d. copies of a random vector $(Z, Y)$, where $Z \le Y$ with probability one. Assume that $X$ and $(Z, Y)$ are independent. For each $i$, $1 \le i \le n$, $X_i$ is either observed if $Z_i < X_i \le Y_i$, or right censored if $X_i > Y_i$, or left censored if $X_i \le Z_i$. Thus the observation can be represented by a random interval $\mathcal{I}$, where

\[ \mathcal{I} = \begin{cases} [X, X] & \text{if } Z < X \le Y, \\ (Y, \infty) & \text{if } Y < X, \\ (-\infty, Z] & \text{if } X \le Z. \end{cases} \tag{1.1} \]

This  censoring  scheme  is  called  a  double  censorship  model  (DC  model). 
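The case distinction in (1.1) can be illustrated with a short sketch. The function name `observed_interval` and the tuple representation `(left, right, kind)` are hypothetical choices made here for illustration and do not appear in the original text; the endpoint conventions follow (1.1) as read above.

```python
import math


def observed_interval(x, z, y):
    """Map a survival time x and a censoring pair (z, y), z <= y, to the
    observed interval of the DC model in (1.1).

    Returns (left, right, kind); the names are illustrative only.
    """
    if z > y:
        raise ValueError("the DC model assumes z <= y")
    if z < x <= y:
        return (x, x, "exact")            # [X, X]: X is observed
    if x > y:
        return (y, math.inf, "right")     # (Y, +inf): right censored
    return (-math.inf, z, "left")         # (-inf, Z]: left censored
```

For instance, with $(Z, Y) = (0.5, 2.5)$ a survival time of 1 is observed exactly, whereas a survival time of 5 is right censored at 2.5.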

Doubly-censored data often arise in biomedical studies, reliability research, and many other fields. Examples of doubly-censored data can be found in Leiderman et al. (1973), Samuelsen (1989) and Kim, De Gruttola and Lagakos (1993).

Turnbull (1974) proposes the generalized maximum likelihood estimator (GMLE) of $F$ with doubly-censored data, and shows that the GMLE is a self-consistent estimator (SCE). Turnbull (1974), Chang and Yang (1987), Chang (1990) and Gu and Zhang (1993) show that the SCEs are consistent, asymptotically normally distributed and asymptotically efficient under certain regularity conditions. Denote

\[ K(x) = P(Z < x \le Y), \]
\[ P_c(x) = P\{X \text{ is not censored} \mid X = x\}, \]
\[ Q = \{x : F(x) > 0 \text{ and } F(x-) < 1\}, \]
\[ B = \{x : K(x-) = 0,\ x \in Q\}. \]

Turnbull (1974) assumes that all random variables take on finitely many values. Gu and Zhang (1993) make weaker assumptions, with a key assumption

\[ K(x-) = P_c(x) > 0 \text{ for all } x \in Q. \tag{1.2} \]

Let $\Omega$ be the sample space and let $O_F = \{x \mid x = X(\omega) \text{ for some } \omega \in \Omega\}$. Assumption (1.2) implies that

\[ O_F \supset Q \text{ and } K(x-) = P_c(x) \text{ for all } x \in Q, \tag{1.3} \]
which is not true for a discrete random variable $X$. The condition really needed in the proofs of Gu and Zhang (1993) is

\[ K(x-) > 0 \text{ for all } x \in Q, \tag{1.4} \]

rather than (1.2). Condition (1.4) is weaker than (1.2), as it does not imply (1.3).

A sufficient condition for $F$ to be identifiable on the whole real line is $P(X \in B) = 0$, since all SCEs are consistent under this assumption (Yu and Li (1998)). It is easy to see that $P(X \in B) = 0$ is weaker than (1.4), since (1.4) implies that $B$ is empty and thus $P(X \in B) = 0$.

An interesting case of a nonempty $B$ is that $B$ is a discrete set. In such a case, we have $P(X \in B) = 0$ if $F$ is continuous. Let $(z_{ij}, y_{ij})$, $i \in K_1$ and $j \in K_2$, be all the possible values of $(Z, Y)$, where $K_1$ and $K_2$ are two index sets. Then $B = Q \setminus \bigcup_{i,j} (z_{ij}, y_{ij}]$, where "$\setminus$" stands for set minus. Notice that if $(z_{ij}, y_{ij}) = (i, i + 1 - 1/j)$, $i \ge 1$ and $j \ge 2$, then $B$ is a discrete set of all positive integers. In a follow-up study, $Z$ stands for the age of a patient at enrollment and $Y$ for the age at the termination of the study. Thus it is possible that in a follow-up study $B$ is a nonempty discrete set with $P(X \in B) = 0$, as it is reasonable to assume that the lifetime distribution is continuous.

If $P(X \in B) = 0$, in general, the GMLE of $F$ is not asymptotically normally distributed and is not asymptotically efficient (see Section 3). We propose a modified GMLE and show that it is efficient under an additional assumption. We conjecture that it is still asymptotically normally distributed and asymptotically efficient under the assumption $P(X \in B) = 0$.

The  organization  of  the  current  manuscript  is  as  follows.  The  modified  GMLE  is  proposed 
in  Section  2.  In  Section  3  we  present  an  example  such  that  the  GMLE  is  not  asymptotically 
normally  distributed  and  is  not  asymptotically  efficient  but  the  modified  GMLE  is.  In  Section 
4,  we  make  some  comments. 

2.  Modified  GMLE 

Let $(L, R)$ be the endpoints of the random interval $\mathcal{I}$ in (1.1). Let $I_i$, $i = 1, \ldots, n$, be a random sample from $\mathcal{I}$, with endpoints $(L_i, R_i)$. We call a nonempty finite intersection $B$ of $I_j$'s an innermost interval (II) if $B \cap I_k = B$ or $\emptyset$ for all $k$. Let $B_1, \ldots, B_M$ be all the distinct innermost intervals induced by these $I_i$'s and assume that $x < y$ for each $x \in B_{i-1}$ and each $y \in B_i$, $i = 2, \ldots, M$. An innermost interval $A$ is called a modified innermost interval (m-II) if it is either a singleton set or $B_1$ or $B_M$. Let $A_1, \ldots, A_m$ be all the distinct m-IIs and assume that $x < y$ for each $x \in A_{i-1}$ and each $y \in A_i$, $i = 2, \ldots, m$.

The modified GMLE (m-GMLE), $\tilde F$, of $F$ is defined by

\[ \tilde F(x) = \sum_{A_j \subset (-\infty, x]} \tilde s_j, \]

where $(\tilde s_1, \ldots, \tilde s_m)$ maximizes the modified likelihood function

\[ \mathcal{L}(s_1, \ldots, s_m) = \prod_{i=1}^{n} \sum_{j=1}^{m} s_j 1(A_j \subset I_i), \quad \text{where } s_j \ge 0 \text{ and } \sum_{j=1}^{m} s_j = 1. \tag{2.1} \]

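Here $1(\cdot)$ is the indicator function. The m-GMLE can be derived by an iterative procedure as follows. At step 1, let $s_j^{(1)} = 1/m$ for $j = 1, \ldots, m$. At step $h$,

\[ s_j^{(h)} = \frac{1}{n} \sum_{i=1}^{n} \frac{1(A_j \subset I_i)\, s_j^{(h-1)}}{\sum_{k=1}^{m} 1(A_k \subset I_i)\, s_k^{(h-1)}}, \quad j = 1, \ldots, m, \text{ and } h \ge 2. \]

Stop at convergence; the limit $\lim_{h \to \infty} s_j^{(h)}$ is the m-GMLE of $s_j$.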
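The iterative procedure can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `m_gmle_weights`, the stopping tolerance `tol`, the iteration cap `max_iter`, and the input format (an $n \times m$ 0/1 table `delta` with `delta[i][j] = 1` when $A_j \subset I_i$) are all assumptions introduced here.

```python
def m_gmle_weights(delta, tol=1e-10, max_iter=10_000):
    """Self-consistent iteration for the weights s_1, ..., s_m.

    delta[i][j] is 1 if the m-II A_j is contained in the observed
    interval I_i and 0 otherwise (hypothetical input format).
    """
    n, m = len(delta), len(delta[0])
    s = [1.0 / m] * m                       # step 1: uniform start
    for _ in range(max_iter):
        new = [0.0] * m
        for row in delta:
            denom = sum(d * sk for d, sk in zip(row, s))
            if denom == 0.0:
                # Assumed guard: an interval with no mass available to it
                raise ValueError("an observed interval contains no m-II with positive mass")
            for j in range(m):
                new[j] += row[j] * s[j] / denom
        new = [v / n for v in new]          # the 1/n factor in the update
        if max(abs(a - b) for a, b in zip(new, s)) < tol:
            return new
        s = new
    return s
```

When every observed interval contains exactly one m-II, the update reduces after one step to the relative frequencies, which agrees with the multinomial interpretation of (2.1).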

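Thus the m-GMLE redistributes the mass among uncensored observations, or the interval $(-\infty, R_{(1)}]$ if $(-\infty, R_{(1)}) = (L_i, R_i)$ for some $i \in \{1, \ldots, n\}$, or $(L_{(n)}, \infty)$ if $(L_{(n)}, +\infty) = (L_i, R_i)$ for some $i \in \{1, \ldots, n\}$, where

\[ R_{(1)} = \min\{R_i : i = 1, \ldots, n\} \text{ and } L_{(n)} = \max\{L_i : i = 1, \ldots, n\}. \tag{2.2} \]

Denote $t_i = \max A_i$, $i = 1, \ldots, m - 1$. Denote $\delta_{ij} = 1(A_j \subset I_i)$ and $s = (s_1, \ldots, s_{m-1})^t$, where $s^t$ is the transpose of $s$. Let $A$ be the $(m - 1) \times (m - 1)$ dimensional empirical information matrix with the $(i, j)$th entry

\[ \frac{1}{n} \sum_{h=1}^{n} (\delta_{hi} - \delta_{hm})(\delta_{hj} - \delta_{hm}) \Big/ \Big(\sum_{k=1}^{m} s_k \delta_{hk}\Big)^2. \]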

h—l  U  k= 1 
Note that $\tilde F(t_k) = \sum_{i \le k} \tilde s_i = e_k^t \tilde s$, where $e_k$ is an $(m - 1) \times 1$ vector with the first $k$ entries being unity and the others all zero, $k = 1, \ldots, m - 1$. Thus an estimator of the variance of $\tilde F(t_k)$ is

\[ \hat\sigma^2_{\tilde F}(t_k) = e_k^t A^{-1} e_k / n. \tag{2.3} \]
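As an illustration of formula (2.3), the sketch below builds the empirical information matrix from the indicators $\delta_{hj} = 1(A_j \subset I_h)$ and evaluates $e_k^t A^{-1} e_k / n$ by solving a linear system. The function names, the 0/1 table `delta`, and the use of plain Gaussian elimination are assumptions made here for a self-contained example; any linear algebra routine would serve.

```python
def information_matrix(delta, s):
    """(m-1) x (m-1) empirical information matrix, as in Section 2.

    delta[h][j] = 1(A_j subset I_h); s holds all m weights.
    """
    n, m = len(delta), len(s)
    a = [[0.0] * (m - 1) for _ in range(m - 1)]
    for row in delta:
        denom = sum(sk * d for sk, d in zip(s, row)) ** 2
        for i in range(m - 1):
            for j in range(m - 1):
                a[i][j] += (row[i] - row[m - 1]) * (row[j] - row[m - 1]) / denom
    return [[v / n for v in r] for r in a]


def solve(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        if aug[p][c] == 0.0:
            raise ValueError("information matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            for q in range(c, k + 1):
                aug[r][q] -= f * aug[c][q]
    y = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(aug[r][q] * y[q] for q in range(r + 1, k))
        y[r] = (aug[r][k] - tail) / aug[r][r]
    return y


def variance_of_F(delta, s, k):
    """Estimate of var(F(t_k)) via (2.3); k is 1-based, 1 <= k <= m-1."""
    n = len(delta)
    e = [1.0 if i < k else 0.0 for i in range(len(s) - 1)]
    y = solve(information_matrix(delta, s), e)
    return sum(ei * yi for ei, yi in zip(e, y)) / n
```

In the purely multinomial case (each observation contains exactly one $A_j$) this reproduces the familiar binomial variance $F(t_k)(1 - F(t_k))/n$ evaluated at the estimated weights.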

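Recall that the GMLE $\hat F$ of $F(x)$ can be obtained by $\hat F(x) = \sum_{B_j \subset (-\infty, x]} \hat w_j$ for all $x$, where $(\hat w_1, \ldots, \hat w_M)$ maximizes the generalized likelihood function

\[ L = L(w_1, \ldots, w_M) = \prod_{i=1}^{n} \Big[\sum_{j=1}^{M} 1(B_j \subset I_i)\, w_j\Big], \quad \text{with } w_i \ge 0 \text{ and } \sum_{i=1}^{M} w_i = 1. \tag{2.4} \]

Turnbull (1974) shows that the GMLE of $(w_1, \ldots, w_M)$ is a solution to the self-consistent equation

\[ w_j = \frac{1}{n} \sum_{i=1}^{n} \frac{1(B_j \subset I_i)\, w_j}{\sum_{k=1}^{M} 1(B_k \subset I_i)\, w_k}, \quad j = 1, \ldots, M, \quad w_i \ge 0 \text{ and } \sum_{i=1}^{M} w_i = 1. \tag{2.5} \]

A solution $(w_1, \ldots, w_M)$ to (2.5) is called an SCE of $(w_1, \ldots, w_M)$, and an estimator $F_1(x) = \sum_{B_j \subset (-\infty, x]} w_j$ is called an SCE of $F(x)$ if $(w_1, \ldots, w_M)$ is an SCE of $(w_1, \ldots, w_M)$. Both the GMLE and the m-GMLE are SCEs. These two estimators are the same when the GMLE puts zero mass on all the innermost intervals which are not m-IIs. In general, they are different.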

Under the assumption $P(X \in B) = 0$ and the additional assumption that $X$ takes on finitely many values, say $x_1, \ldots, x_m$, we can show that the m-GMLE is efficient. Let $A_i = \{x_i\}$ and $s_i^o = P(X = x_i)$, $i = 1, \ldots, m$. Then $s_i^o > 0$. Under the above assumptions, with probability one, for $n$ large enough, the random sample contains all the $A_j$'s. In view of the likelihood function (2.1), the problem reduces to parametric estimation of a multinomial distribution function with parameter $s$. The m-GMLE of $s$ is the MLE of $s$ in this parametric estimation problem. Since $s_i^o > 0$ for all $i$, by the standard large sample theory (see e.g., Ferguson (1996)), the MLE of $s$ is consistent, asymptotically normally distributed and asymptotically efficient. The asymptotic covariance matrix can be estimated by the inverse of the sample information matrix, $A^{-1}$. This justifies the use of formula (2.3). An explicit form of the inverse of the information matrix of a self-consistent estimator is given in Turnbull (1974). Since the m-GMLE is also a self-consistent estimator, the formula is applicable to the m-GMLE.
Remark. We conjecture that the m-GMLE is asymptotically normally distributed and asymptotically efficient if $P(X \in B) = 0$. The above paragraph confirms it under the additional assumption that $O_F$ is finite. We can further prove the conjecture under the additional assumption that $O_F$ consists of isolated points or $B$ is a union of mutually disjoint intervals $(u_i, v_i]$. We have decided not to present the latter proofs here but refer the reader to a technical report (Yu and Wong (1998)), as they are not as short as the above paragraph and still need an additional assumption.


3.  A  Simple  Example 

We now give an example in which the GMLE $\hat F$ of $F$ is not asymptotically efficient and $\sqrt{n}(\hat F - F)$ does not converge in distribution to a Gaussian process, whereas the m-GMLE $\tilde F$ is asymptotically efficient and $\sqrt{n}(\tilde F - F)$ does converge to a Gaussian process.

Suppose that in a DC model, $P((Z, Y) \in \{(0.5, y) : y \in [2, 3)\}) = q_1$ and $P((Z, Y) \in \{(z, 8) : z \in (3, 4]\}) = q_2$, where $q_1 + q_2 = 1$; $F(x) = p_1 1(x \ge 1) + p_2 1(x \ge 5)$, where $p_1$ and $p_2 > 0$. Then $(L, R)$ takes values $(1, 1)$, $(5, 5)$, $(-\infty, y)$ and $(z, \infty)$, where $y \in (3, 4]$ and $z \in [2, 3)$. Given a random sample of size $n$ from $(L, R)$, there are $N_1$ $(1, 1)$'s, $N_2$ $(5, 5)$'s, $N_3$ intervals of the form $(-\infty, y)$ and $N_4$ intervals of the form $(z, +\infty)$.

Note that in this case assumption (1.4) is violated, as $Q = [1, 5]$ (see (1.4)) but $K(3-) = P(Z < 3 \le Y) = 0$; however, $P(X \in B) = 0$ as $B = \{3\}$.

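We now derive the GMLE and the m-GMLE. With probability one, if $n$ is large enough, the innermost intervals are $[1, 1]$, $(y_0, z_0]$ and $[5, 5]$ and the m-IIs are $[1, 1]$ and $[5, 5]$, where $y_0$ is the largest $L_i$ among all $L_i < 3$ and $z_0$ is the smallest $R_i$ among all $R_i > 3$. Let

\[ \begin{aligned} U_n &= \frac{N_3}{N_2 + N_3} - \frac{N_1}{N_1 + N_4}, \\ F_1(x) &= \frac{N_1}{N_1 + N_4} 1(x \ge 1) + U_n 1(x \ge 3) + \frac{N_2}{N_2 + N_3} 1(x \ge 5), \\ \tilde F(x) &= \frac{N_1 + N_3}{n} 1(x \ge 1) + \frac{N_2 + N_4}{n} 1(x \ge 5), \\ \hat F(x) &= F_1(x) 1(U_n > 0) + \tilde F(x) 1(U_n \le 0). \end{aligned} \tag{3.1} \]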

Verify that $\hat F$ and $\tilde F$ are the GMLE and m-GMLE of $F$, respectively. It follows from the strong law of large numbers (SLLN) that the three estimators in (3.1) are all consistent. Note that $N_1 + N_3$ has a binomial distribution $\mathrm{Bin}(n, F(2))$. Thus $\tilde F(x)$ is asymptotically efficient for all $x$, and $\sqrt{n}(\tilde F - F)$ converges in distribution to a Gaussian process.

Let $p = F(4) - F(2)$; then the m-GMLE of $p$ is $\tilde p = \tilde F(4) - \tilde F(2)$ and the GMLE of $p$ is $\hat p = \hat F(4) - \hat F(2)$. In order to show that the GMLE $\hat F$ is not efficient and $\sqrt{n}(\hat F - F)$ does not converge in distribution to a Gaussian process, it suffices to show that $\hat p$ is not asymptotically efficient and is not asymptotically normally distributed. This is done next.

Note that $\tilde p = 0$, thus $\mathrm{var}(\tilde p) = 0$. However, $\hat p = U_n 1(U_n > 0)$ and $\sqrt{n} U_n$ converges in distribution to $U$, a normal random variable with mean 0 and standard deviation $\sigma > 0$, which can be obtained by the delta method. That is, $\mathrm{var}(\tilde p) < \mathrm{var}(\hat p)$. Thus the GMLE $\hat p$ is not asymptotically efficient. Moreover, $\sqrt{n} U_n 1(U_n > 0)$ converges in distribution to $U 1(U > 0)$, which is not a normal random variable.
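A small Monte Carlo sketch may help to visualize the limit $U 1(U > 0)$: roughly half of the simulated values of $\sqrt{n}\,\hat p$ are exactly zero, so the limit cannot be normal. The function name, the default values $p_1 = q_1 = 0.5$, the number of replications and the seed are arbitrary choices made here for illustration; only the counts $N_1, \ldots, N_4$ and the formula $\hat p = U_n 1(U_n > 0)$ come from the text.

```python
import random


def simulate_sqrt_n_p_hat(n, p1=0.5, q1=0.5, reps=2000, seed=0):
    """Draw reps values of sqrt(n) * p_hat in the example of Section 3."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        n1 = n2 = n3 = n4 = 0
        for _ in range(n):
            x_is_1 = rng.random() < p1      # X = 1 w.p. p1, else X = 5
            type1 = rng.random() < q1       # (Z, Y) = (0.5, y), y in [2, 3)
            if x_is_1 and type1:
                n1 += 1                     # exact observation (1, 1)
            elif x_is_1:
                n3 += 1                     # left censored, (-inf, y)
            elif type1:
                n4 += 1                     # right censored, (z, +inf)
            else:
                n2 += 1                     # exact observation (5, 5)
        if n1 + n4 == 0 or n2 + n3 == 0:
            continue                        # U_n undefined; skip this sample
        u_n = n3 / (n2 + n3) - n1 / (n1 + n4)
        draws.append(n ** 0.5 * max(u_n, 0.0))
    return draws
```

The m-GMLE counterpart $\sqrt{n}\,\tilde p$ is identically zero in every replication, since $\tilde F$ places no mass in $(2, 4]$.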

4.  Discussion 

Under assumption (1.4) and some additional assumptions made in Gu and Zhang (1993), the GMLE and the m-GMLE have the same asymptotic properties, as both of them are SCEs. If $P(X \in B) = 0$, then both of them are uniformly strongly consistent (see Yu and Li (1998)).

The m-GMLE has two advantages over the GMLE. First, under the assumption $P(X \in B) = 0$, the GMLE is in general not efficient, but we conjecture that the m-GMLE is. At the end of Section 2, the conjecture is confirmed in the case that $X$ takes on finitely many values but the censoring vector can be arbitrary. Second, in applications there is a computational feasibility problem in obtaining the GMLE using the self-consistent algorithm if the sample size is large. It is then desirable to reduce the number of parameters to be estimated, and the m-GMLE has fewer parameters to estimate than the GMLE.

When $P(X \in B) > 0$, $F$ is not identifiable on $[0, +\infty)$. Thus neither the GMLE nor the m-GMLE is consistent on $[0, +\infty)$. However, the GMLE is consistent at each observation, whereas the m-GMLE is not. Thus when the GMLE assigns to an II which is not an m-II a mass which is about the same as the mass assigned to an m-II, it may be an indication that $P(X \in B) > 0$. In such a case, it is better to use the GMLE. However, we do expect that a